PREFACE

This manual has been the outgrowth of a set of lectures on Field Plot Technique
given to seniors and graduate students at Colorado State College since 1930. It has
been found practical in the classroom for a 2 to 4 credit combined lecture and laboratory
course. The problems and questions have proved to be important aids to the
student. While "Field Plot Technique" has been prepared primarily for class use, it
is  hoped  that  it  will  appeal  to  the  technical  worker  in  Agronomy  as  a  reference  to 
the  more  important  statistical  methods  and  tables.  The  large  number  of  references 
quoted  will  give  the  reader  a  ready  reference  to  the  major  papers  on  various  phases 
of  applied  statistics. 

The  organization  of  the  subject  matter,  and  the  manner  in  which  the  statistical 
methods  are  interwoven  with  the  applications,  differs  somewhat  from  the  conventional 
approach.  The  writers  feel  that  the  student  of  agronomic  experimentation  needs  an 
elementary  picture  of  the  factors  to  be  considered  in  a  research  program  with  special 
reference  to  the  field  experiment.  For  this  reason,  an  attempt  has  been  made  to 
coordinate  the  historical  and  logical  background  of  agronomic  experimentation  with 
statistical  techniques  and  their  application  to  the  design  of  the  practical  types  of 
field  experiments.  This  also  requires  that  the  student  be  familiar  with  the  mechan- 
ical procedures  generally  followed  in  routine  experimental  work. 

The  development  of  the  various  statistical  techniques  has  been  intuitive  rather 
than  rigorously  mathematical.  The  aim  has  been  to  lead  the  student  to  understand 
the  formulas  he  applies  without  necessarily  being  able  to  derive  them  mathematically. 
The  symbolism  employed  in  the  text  was  chosen  with  regard  to  what  appears  to  be  the 
most  common  usage.  Considerable  effort  has  been  spent  in  striving  for  consistency. 

That  it  is  impossible  in  an  elementary  text  to  present  and  interpret  many  of 
the  complexities  involved  in  some  modern  experiments  is  obvious.  It  is  hoped  that 
a  sufficient  foundation  will  be  laid  for  the  student  so  that  he  can  intelligently 
study  the  more  advanced  treatises. 

The  writers  are  deeply  indebted  to  Dr.  F.  R.  Immer,  Professor  of  Agronomy  and 
Plant  Genetics,  University  of  Minnesota,  for  permission  to  make  liberal  use  of  his 
classroom  material,  especially  in  chapters  11,  17,  and  18.  They  wish  to  express 
their  appreciation  to  Dr.  S.  C.  Salmon,  Division  of  Cereal  Crops  and  Diseases,  U.  S. 
Department of Agriculture, for criticisms and helpful suggestions. Dr. K. S. Quisenberry
of the same division has assisted by his criticisms of chapter 21. The writers
are particularly grateful to Professor R. A. Fisher and his publishers, Oliver
and Boyd, for permission to reproduce the Table of χ² from "Statistical Methods for
Research  Workers."  Professor  G.  W.  Snedecor,  Iowa  State  College,  generously  allowed 
us  to  include  his  table  of  "F  and  t".  The  writers  also  wish  to  express  their  thanks 
to  Dr.  C.  I.  Bliss  for  permission  to  use  his  table  of  angular  transformations.  The 
table of Naperian logarithms used in the manual is taken from "Four Figure Mathematical
Tables" by the late J. T. Bottomley and published by Macmillan and Co., Ltd.
(London). The writers are grateful to the publishers and to the representatives of
the  author  for  permission  to  use  this  table.  To  Dr.  D.  W.  Robertson,  one  of  their 
colleagues,  they  express  their  appreciation  for  various  helpful  suggestions. 
Part I


Introduction  to  Experimentation 


CHAPTER  I 
STATUS  OF  AGRONOMIC  RESEARCH 

I.  Rise  of  Agronomic  Research 

Although the art of agronomy has been practiced for centuries, the science of agronomy
is only about 100 years old. The need for reliable information in this country
has  come  about  gradually  as  farmers  have  come  to  realize  some  of  the  many  problems 
which  confront  the  agricultural  industry,  problems  in  soil  fertility,  the  control  of 
diseases and pests, winter-hardiness in crops, among many others. In addition to
the needs of the farmers themselves, the establishment of the Land-Grant Colleges
under the Morrill Act in 1862 brought about an acute need for subject matter for the
agricultural colleges. It soon became very apparent that the problems in agriculture
were complex and that well-trained men were needed to solve them. In general, it
may  be  said  that  agricultural  research  began  with  simple  empirical  tests,  but  has 
gradually  developed  until  it  has  now  attained  a  scientific  basis.   In  the  short  space 
of  75  years,  so  much  subject  matter  has  been  accumulated  in  the  field  of  agriculture 
that  no  one  man  could  hope  to  be  familiar  with  all  of  it.  This  led  to  specialization 
within the field between 1900 and 1910 in America. The branches recognized in most
agricultural  colleges  and  experiment  stations  are:  Agronomy,  animal  husbandry, 
horticulture,  entomology,  forestry,  home  economics,  and  veterinary  medicine  or  path- 
ology. 

Agronomy  as  a  science  was  developed  from  the  old  style  variety  trials,  crop  rotation 
tests,  and  soil  culture  experiments,  when  field  culture  was  an  empirical  art.  Re- 
search workers  and  others  interested  in  the  science  of  crops  and  soils  formed  the 
American  Society  of  Agronomy  in  1907.   In  regard  to  Agronomy,  Carleton  (1907)  states: 
"As  a  science  it  investigates  anything  and  everything  concerned  with  the  field  crop, 
and  this  investigation  is  supposed  to  be  made  in  a  most  thorough  manner,  just  as 
would be done in any other science". Thus, agronomy is the laboratory and workshop
of  many  sciences:  Agrostology,  chemistry,  botany,  ecology,  genetics,  pathology, 
physics,  physiology,  and  others  concerned  with  the  problems  of  crops  and  soils. 
Ball  (1916)  early  observed  that  it  has  been  necessary  for  the  experimenter  (in 
agronomy)  to  turn  from  the  gross  aspect  to  minute  detail  in  order  to  solve  some  of 
its problems. Empirical knowledge has been rapidly supplemented by fundamental information
as a result of organized research and the improvement in its technique.

II.  Establishment  of  Experiment  Stations 

It  is  difficult  to  realize  that  the  present  large  network  of  experiment  stations  in 
this country and in other parts of the world has been established in the past 100
years.  In  fact,  the  science  of  agriculture  practically  began  with  this  movement. 

(a)  First  Experiment  Station 

Jean Baptiste Boussingault established the first experiment station in 1834,
being  the  first  man  to  undertake  field  experiments  on  a  practical  scale.  He  farmed 
land  at  Bechelbronne,  Alsace,  where  he  carried  on  research  of  a  high  calibre.  Bous- 
singault set  out  to  investigate  the  source  of  nitrogen  in  plants,  and  systematically 
weighed  the  crops  and  the  manures  applied  for  them.  He  analyzed  both  and  prepared  a 
balance-sheet.  Furthermore,  this  investigator  studied  the  effects  on  plants  when 
legumes  were  in  the  rotation.  He  concluded  that  plants  obtained  most  of  their  nitro- 
gen from  the  soil. 
(See  Chapter  2.) 




(b)  Rothamsted Experimental Station

The Rothamsted Experimental Station was established by John Bennet Lawes on
his farm in England in 1843. Hall (1905), in his account, states that "Rothamsted
is  now  a  household  word  wherever  the  science  of  agriculture  is  studied."  Lawes 
found that phosphates were important fertilizers and discovered a method to make
phosphate fertilizer by the application of sulfuric acid to phosphate rock. Formerly
bones were used as a sole source of phosphates. This significant discovery led to
experimentation on a large scale. The systematic field experiments, begun in 1843
and  continued  to  this  day,  have  dealt  particularly  with  soil  fertilizers  and  crop 
rotation. These experiments long have been models for carefully planned experiments.
Lawes was aided by Dr. J. H. Gilbert, who commenced work at Rothamsted in 1843. The
two  men  worked  together  for  57  years.  Recently,  Dr.  R.  A.  Fisher  has  brought  about 
modifications  in  the  field  experiments  to  make  them  amenable  to  statistical  treatment. 

(c)  American Experiment Stations

Some  of  the  early  history  of  American  experiment  stations  is  given  by  True 
(1937) and by Shepardson (1929). South Carolina went on record as favoring an experiment
station in 1785, but the general movement for the establishment of experiment
stations began about 1871 because of the attention attracted by the experiments
of  Lawes  and  Gilbert  of  England.  In  the  meantime,  the  Morrill  Act  signed  by  Lincoln 
in  1862,  provided  for  the  so-called  land-grant  colleges  for  the  study  of  agriculture 
and mechanic arts. California established the first experiment station in 1875, and
began field experiments on deep and shallow plowing for cereals. A station was
started in North Carolina in 1877, after which many others followed. The Hatch Act,
passed by Congress in 1887, was the start of the present experiment stations. Twelve
were in existence at that time. Increased funds were provided by the Adams Act in
1906, by the Purnell Act in 1925, and by the Bankhead-Jones Act in 1935. The United
States Department of Agriculture has been of rather recent origin, the Secretary becoming
a Cabinet member in 1889. At present, the federal government controls funds
given to the states for experimental work. In general, the system has been satisfactory
because it has proved to be participation and coordination rather than control.

III.  Reasons for Public Support of Agricultural Research

There  has  been  some  criticism  on  the  use  of  public  funds  for  agricultural  research, 
but  their  use  has  been  justified  on  the  grounds  that  the  welfare  of  agriculture  is 
basic to the nation. In addition, it would be almost impossible to place agricultural
research on a self-supporting basis because the results of research are so difficult
to control through patents or otherwise.

(a)  Agricultural Welfare

There are 6,000,000 farmers and 360,000,000 to 365,000,000 acres of cultivated
land in this country, some of which has been cultivated more than 300 years.
The  virgin  fertility,  in  many  cases,  has  been  exhausted.  The  experiences  and  needs 
of  these  farmers  are  significant  because  the  prosperity  of  the  nation  depends  to  a 
large  extent  upon  agriculture.  The  production  of  food  and  fiber  is  fundamental  to 
the  public  welfare,  as  research  that  leads  to  lower  cost  of  production  passes  its 
benefits  on  to  the  consumer.  Haskell  (1923)  calls  attention  to  the  fact  that,  in 
the  case  of  crop  losses  due  to  diseases  and  other  factors,  the  consumer  ultimately 
pays  a  higher  price  for  his  food.  He  pays  for  depleted  soil  fertility  in  the  same 
way.  Thus,  the  state  may  actually  gain  more  from  the  benefits  of  research  than  the 
farmer  himself. 

(b)  Limitations  of  Farm  Experiences 

It  has  been  impossible  for  several  reasons  to  collect  scientific  information 
of much value from farm experiences. (1) Inadequate Farm Records: The results obtained
by farmers are inaccurately and incompletely recorded from the experimental
viewpoint. Their experiences are generally limited to acres and yields such as
found in stories in the farm press. Farmers very often place undue emphasis on the
unusual. (2) Failure to Consider All Factors: The essence of scientific progress
is to determine "why". Among the many variables in agriculture, variation in season
is exceedingly important and may overshadow all other factors. The farmer is quite
likely to base his judgment and conclusions on the results of one or two years' performance.
Thorne (1909) states that many experiments which farmers attempt are
valueless or misleading because of failure to observe some essential condition of
experimentation. (3) Inadequate Training: As a rule, the farmer lacks the training
or  experience  necessary  for  the  evaluation  of  experimental  results.  Hall  (1905) 
makes this statement: "Agricultural science involves some of the most complex and
difficult  problems  the  world  is  ever  likely  to  have  to  solve,  and  if  it  is  to  con- 
tinue to  be  of  benefit  to  the  farmer,  investigations,  so  far  as  their  actual  conduct 
goes,  must  quickly  pass  into  regions  where  only  the  professional  scientific  man  can 
hope to follow them ...." (4) Inadequate Funds: Farmers lack the funds, help, and
equipment  necessary  for  experimental  work.  Experimentation  is  quite  expensive  since 
practical  considerations  are  necessarily  put  aside.  An  experiment  must  be  conducted 
with  precision  in  order  to  obtain  reliable  results,  rather  than  for  financial  return. 
For instance, Hall (1905) tells that some Rothamsted fields have grown wheat for 60
years,  year  after  year,  on  the  same  land.  As  the  modern  farmer  seldom  grows  wheat 
continuously,  he  looks  upon  this  experiment  as  hopelessly  impractical  when  it  is 
pointed  out  to  him  on  field  days.  Nevertheless,  this  very  test  furnished  the  bulk 
of  the  early  proof  that  losses  in  yield  would  result  from  continuous  wheat  culture. 
The aim of the Rothamsted test, as it continues, is to find out how the wheat plant
grows.

IV.  Experiment  Station  Funds 

Agricultural  research  in  this  country  is  publicly  financed  almost  altogether.  Feder- 
al and  state  agencies  spent  about  25  million  dollars  on  agricultural  research  for 
the  year  1927-28.  This  total  sum  represented  approximately  0.20  percent  of  the  gross 
income for agricultural products, a figure wholly within reason.

(a)  The Hatch Act

The first Federal subsidy for agricultural research was the Hatch Act, passed
in 1887. It gave each state $15,000 per year, a wide latitude in the use of the
funds being permitted. The Act made it possible to conduct original experiments or
verify experiments along lines as follows: (1) physiology of plants and animals;
(2) diseases of plants and animals with remedies for the same; (3) the chemical composition
of useful plants at different stages of growth; (4) rotation studies;
(5) testing the adaptation of new crops and trees; (6) analyses of soils and water;
(7) chemical composition of manures, natural and artificial, and their effect on
crops; (8) testing the adaptation and value of grasses and forage plants; (9) testing the
composition and digestibility of different foods for domestic animals; (10) research
on butter and cheese production; and (11) examination and classification of soils.
None of the funds can be used for the purchase or rental of lands or expenses for
farm operations.

(b)  The  Adams  Act 

A  similar  amount  of  money  was  granted  to  the  states  by  the  Adams  Act,  passed 
in  1906.  The  funds  must  be  used  for  original  researches  or  experiments  that  bear 
directly  on  agriculture.  Research  of  a  fundamental  nature  is  required  under  this 
fund. None of the money can be applied to substations, or to the purchase or rental
of land.


(c)  The  Purnell  Act 

The Purnell Act, passed in 1925, provided for additional funds which now amount
to $60,000 per year for each state. These funds must be used on specific projects,
but the requirements are less exact than for the use of Adams funds. The Act provides
for investigations on the production, manufacture, preparation, use, distribution,
and marketing of agricultural products.

(d)  The Bankhead-Jones Act

Certain difficulties in the use of experimental funds for broad general projects
led to the passage of the Bankhead-Jones Act in 1935 which will, in five years
(1940), provide $5,000,000 for research. Its provisions have been described as
follows:   "To  conduct  scientific,  technical,  economic,  and  other  research  into  laws 
and  principles  underlying  basic  problems  of  agriculture  in  its  broadest  aspects....". 
It  also  authorizes  research  for  the  improvement  of  quality  of  agricultural  commodi- 
ties and  for  the  discovery  of  uses  for  farm  products  and  by-products.  The  U.  S. 
Department of Agriculture receives 40 per cent of this fund, while 60 per cent is
allotted  to  the  states  on  the  basis  of  rural  population.  It  is  generally  understood 
that  the  funds  must  be  used  for  new  lines  of  work. 

V.  The Personal Equation in Research

As  for  agricultural  research  in  general,  successful  agronomic  research  depends  upon 
the  ability,  permanency,  and  honesty  of  the  workers.  The  personnel  for  investiga- 
tional work  must  be  well-trained  in  the  basic  sciences  as  well  as  leaders  in  agricul- 
tural thought.  Their  outlook  must  be  broad. 

(a)  Education  for  Investigational  Work 

The  amount  of  training  necessary  for  research  is  great.  The  investigator 
must  be  skilled  in  the  art  of  agronomy  and  trained  in  the  closely  related  sciences. 
In  fact,  he  should  have  an  adequate  educational  background  before  research  is  even 
attempted. A good foundation in English, physics, and chemistry is basic for all
research  in  agriculture.  Biology  adds  the  conception  of  organism,  while  mathematics 
is  the  common  instrument.  Thorough  training  in  all  branches  of  botanical  science  is 
desirable  in  agronomy.  This  includes  taxonomy,  anatomy,  physiology,  pathology,  etc. 
Other  sciences  that  are  useful  are:  Geology,  bacteriology,  genetics,  and  statistics. 
Among  the  authorities  who  agree  on  this  general  type  of  background  are  Howard  (1924), 
Wheeler  (19U),  Ball  (1916),  Carleton  (1907),  and  Richey  (1937).  A  practical  view- 
point is  necessary,  but  this  is  largely  the  result  of  boyhood  training  and  common 
sense. 

(b)  Qualities  in  Successful  Research  Men 

There  is  some  question  about  the  successful  scientist  necessarily  being  a 
genius.  The  term  should  be  qualified  to  include  perseverance,  common  sense,  and  in- 
finite pains.  Howard  (1924)  emphasized  the  qualities  needed  when  he  said:   "Here  the 
man  is  everything;  the  system  is  nothing."  (1)  Imagination:  Some  imagination  is 
essential  in  the  research  worker  but,  of  course,  it  must  be  scientific  imagination. 
(2) Discrimination: An investigator must have the power of discrimination, that is,
he  must  be  able  to  recognize  the  essentials  and  non-essentials  in  research.  He  must 
select  the  features  which  are  most  worthwhile.  It  is  possible  to  record  too  much 
data  on  a  subject,  and  thus  cloud  the  entire  issue.  (3)  Accuracy:  There  is  a  great 
need for accuracy in experimental work. An investigator should record only those
notes  whose  reliability  is  well  established.  One  should  never  take  measurements  so 
fine  that  they  imply  false  accuracy.  The  figures  taken  by  an  investigator  should 
give him confidence in his work. (4) Honesty in observation: The investigator should
always accept observations without regard to their agreement with his own preconceived
ideas. One should record only the things he sees. (5) Fairness: The research
man should give due credit to others, and keep within his own field unless a
phase of his work calls for cooperation with others. (6) Enthusiasm: One should be
enthusiastic about his work, being ready to put in long hours or extra time when
necessary. Call (1922) says there must be a love for the work so great, in those
engaged in research, that it will enable them to push forward in the face of obstacles
which may seem insurmountable. (7) Courage: One should always have the courage
of his convictions. He should not be afraid to try something new.

(c)  Initiative in Experimental Projects

The  project  system  has  an  enormous  value  in  the  coordination,  continuity, 
and  conclusion  of  agricultural  experimental  work  because  it  requires  the  submission 
of  an  outline  and  its  approval  before  any  work  is  done.  Success  in  experimental 
projects  depends  upon  the  leader,  his  scientific  attitude,  depth  of  motive,  concep- 
tion of  the  problem,  and  its  requirements.  Not  all  research  is  good.  In  fact, 
there is a chance for much waste. While partial failure is inevitable, it is possible
for the investigator to gauge plausible success. Allen (1930) advises research
workers to think scientifically, avoid adherence to routine, and keep abreast of the
times. The investigator should avoid the belief that his own compartment is water-tight
and self-sufficient.

VI.  Results of Agronomic Research

Many  contributions  have  been  made  in  crops  and  soils  by  the  experiment  stations, 
particularly  during  the  past  25  years.  Some  of  the  more  important  advances  in  the 
past  quarter  of  a  century  in  field  crops  are  summarized  by  Warburton  (1933)  while 
those in soils are given by Lipman (1933).

(a)  Field  Crops 

Among the contributions in corn have been the discovery that the show-type
ear  is  unrelated  to  its  performance  in  the  field,  that  ear-to-row  breeding  may  not 
lead  to  improvement  in  corn  yields,  and  that  the  combination  of  inbred  lines  in 
hybrids  has  resulted  in  higher  corn  yields.  In  wheat,  the  discovery  of  rust  resist- 
ance and  of  physiologic  races  has  enabled  investigators  to  breed  for  resistant 
varieties. The same is true for bunt. The introduction and use of sorghums, as well
as their improvement, has resulted in their production throughout the west. The
cause  of  flax  "sickness"  has  been  discovered  as  due  to  wilt  with  the  result  that  re- 
sistant varieties  have  been  bred.  Sweet clover,  once  a  weed,  has  been  found  to  be  a 
valuable  crop.  Many  improved  varieties  of  crops  have  been  developed  for  disease  re- 
sistance, drouth  resistance,  high  quality  or  high  yield.  Marquis  wheat  is  one  of  the 
most  widely  known  improved  varieties. 

Tillage  has  been  shown  to  be  beneficial  because  of  weed  control  rather  than 
moisture  conservation  from  a  dust  mulch.  Both  Funchess  (1929)  and  Richey  (1937) 
have  given  similar  lists  of  advances  made  in  field  crop  science. 

(b)  Soils 

A quarter-century ago, physical-chemical analyses of soil without other data
were  frequently  erroneous  as  a  basis  for  the  estimation  of  the  agricultural  value  of 
soils.  In  recent  years  some  of  the  more  valuable  contributions  have  been  as  follows: 
(1) Use of mineral fertilizers to improve soil fertility; (2) ionic exchange in soil
colloids that led to an explanation of alkali-soil formation; (3) soil classification
and soil survey; (4) soil acidity in its relation to plant growth; (5) soil colloids
and their properties; (6) soil bacteria and other organisms and their influence on
soil  fertility;  and  (7)  soil  erosion  and  its  control.  That  soil  productivity  may  be 
maintained  for  a  long  period  of  time  by  the  use  of  sound  rotation  and  manurial  prac- 
tices has  been  shown  by  the  Morrow  plots  at  Illinois.  The  results  for  39  years  have 
been summarized by De Turk, et al. (1927).


VII.  Value of Early Agronomic Experiments

Some of the investigational work in agronomy before 1910 was of little value due to
errors in the experiments, many of which were great enough to vitiate the conclusions.
Contradictions were common. As Piper and Stevenson (1910) point out, results were
sometimes suppressed because they failed to coincide with current theory. "In short,
all  scientific  evils  necessarily  associated  with  experimental  methods  are  too  evident 
in the field work in agronomy." The same type of criticism applies to other agricultural
branches at that time. There were many reasons for this situation. The
"guess method" was widely used by the old school of experimenters for the accumulation
of information. They usually lacked facts, lacked a broad outlook, were limited
in their experiences and, in many cases, had wide differences in viewpoints. Some of
the shortcomings have been due to pressure for information with the result that the
conclusions  were  often  based  on  too  few  data.  Other  weaknesses  were  due  to  the  view- 
point in  some  quarters  that  empirical  facts  were  preferable  to  fundamental  informa- 
tion from  a  practical  standpoint. 

While  many  of  the  early  experiments  would  be  inacceptable  today  in  the  light  of 
modern  experimental  standards,  they  nevertheless  contributed  to  progress.   Some  of 
them  were  as  well  conducted  as  those  of  today.  Early  agricultural  practices  were 
determined quite as much by opinion as by experiment. It would have been a poor experiment
indeed were it to be less reliable than unsupported opinion. These early
experiments  must  be  evaluated  in  relation  to  the  knowledge  of  the  time  as  well  as 
their effects on agricultural science and practice. For example, the early Rothamsted
investigations on the source of nitrogen in plants finally led to a solution of
the  problem  even  tho  the  field  experiments  conducted  in  connection  with  them  would 
be  considered  today  as  inadequate. 

Many of the weaknesses in early experiments have been met gradually through (1) wider
application  of  modern  statistical  methods,  (2)  replication  of  plots  or  treatments, 
and  (3)  wider  use  of  the  inductive  or  scientific  method  in  which  general  principles 
are  sought  rather  than  empirical  facts. 

VIII.  Present Trends in Agronomic Research

Some  very  definite  trends  are  apparent  in  modern  agronomic  research,  among  them  being 
the  emphasis  on  design  of  experiments,  long-time  projects,  and  regional  coordination 
of  research. 

(a)  Design  of  Experiments 

In  recent  years,  a  great  deal  of  stress  has  been  placed  on  the  design  of 
experiments.  The  field  lay-out  and  the  method  of  analysis  of  the  data  are  coordinat- 
ed so  as  to  lead  to  more  efficient  experimental  results.  The  emphasis  on  design  has 
been made by the Rothamsted workers. Design focuses attention on the objects of an
experiment  that  can  be  attained  in  no  other  way.  This  trend  promises  to  reduce  the 
number  of  situations  where  an  experiment  is  conducted  and  data  collected  before  a 
method  of  analysis  is  conceived. 

(b)  Long-time  Projects 

Another definite trend is toward the long-time project. According to Henry
Wallace  (1936),  "The  solution  of  problems  related  to  crop  production  is  a  matter  of 
years.  The  improvement  of  plants  by  breeding  must  extend  through  many  generations. 
Varieties  must  be  compared  in  a  number  of  different  kinds  of  seasons  for  correct 
evaluation.  The  same  is  true  of  tests  of  fertilizers,  spraying  practices,  and  cul- 
tural methods.  To  be  productive,  a  program  of  plant  research  accordingly  must  be 
stable,  with  a  concentration  of  effort  until  a  given  problem  is  solved  or  its  solu- 
tion found  impractical  for  the  time  being." 


(c)  Regional  Coordination  of  Research 

The  regional  coordination  of  research  work  to  reduce  duplication  of  effort 
is being regarded more and more as essential. It has been stressed by Call (1934),
Jarvis (1931), and others. Agronomic research began as isolated bits of investigation
to solve local problems. Cooperation and coordination were developed later to
reduce  wasteful  duplication.  It  also  makes  possible  a  comprehensive  attack  on  intri- 
cate problems,  as  well  as  the  elimination  of  artificial  boundaries.  Such  effort  en- 
courages personal  contacts  and  exchanges  of  ideas  between  different  investigators. 
Various bureaus of the U. S. Department of Agriculture took the leadership in regional
coordination. The most formal efforts on regional coordination are in the northeastern
states on pasture investigations and soil organic matter studies. The limitation
of initiative and individuality of investigators has sometimes been feared as
a result of regional coordination, but for the most part the fear appears to be unfounded.
Questions for Discussion


1.  What  conditions  led  to  the  subdivision  of  the  agricultural  field? 

2.  What  are  the  functions  of  agronomy  as  a  science?  Why? 

3.  Who founded the first experiment station? What results were obtained?

4.  When was the Rothamsted Experimental Station established? Where? By whom? Why?

5.  What  has  Rothamsted  contributed  to  early  agricultural  science? 

6.  Where  was  the  first  American  experiment  station  established?  When?  Upon  what 
did it work?

7.  Give  some  reasons  to  justify  agricultural  experimentation  as  a  public  duty. 

8. Why is agriculture in America a national concern second to none?

9.  Give  several  reasons  why  a  farmer  is  generally  unable  to  do  experimental  work. 

10.  Name  the  acts  of  Congress  that  contributed  to  the  agricultural  experiment  sta- 
tions, together  with  their  dates  of  passage. 

11. What special requirements are necessary for the expenditure of funds under the
Adams Act? Bankhead-Jones Act? Hatch Act?

12. What kind of basic training is necessary for agronomic research?

13.  What  would  you  consider  as  some  of  the  most  important  attributes  of  a  successful 
investigator? 

14. Discuss briefly the following characteristics in relation to research:

(1) imagination, (2) classification, (3) discrimination, (4) accuracy, and
(5) thoroughness.

15. Why has the project system been useful in research?

16. Name five contributions to crop knowledge made by experiment stations. Five
contributions to soil science.

17.  What  are  some  reasons  for  the  early  contradictions  in  agronomic  science? 

18.  What  was  the  value  of  early  agronomic  experiments?  What  were  some  of  their  weak- 
nesses? 

19. Name and discuss three trends in agronomic research at the present time.


CHAPTER  II 
HISTORY OF BASIC PLANT SCIENCES

I.  Early  History  of  Basic  Sciences 

Agronomy as a science began with the establishment of the first experiment station
by Jean Baptiste Boussingault in 1834, although many empirical facts were known
before that time.

(a)  Early  Science 

Science, in general, dates from Aristotle, who was the founder of zoology and
the forerunner of evolution. As one of the founders of the inductive method, he
first conceived the idea of organized research. In fact, his principles might well
be observed at the present time. After Aristotle, little progress was made for 2,000
years. Among his theories was the one that the universe was composed of four
elements: air, earth, fire, and water. This was accepted for centuries because the
habit was to assume some man as an authority rather than to investigate. In the
17th century, Galileo and Newton began to base conclusions on facts.
Francis Bacon wrote books which emphasized that theories should be based on facts
rather than on authorities.

(b)  Reasons  for  Slow  Progress  in  Science 

Progress in agricultural science has had to wait on discoveries in the basic
sciences of physics and chemistry. There are many reasons for the slow development
of science in past ages. (1) Slavery was the general rule, with the result that
there was little stimulus to improve. (2) Experimenters lacked accurate instruments
for measurement. (3) The mildness of the climate in the early civilized countries
restricted industry. (4) Mathematical science was restricted. (5) The scientific
method developed by Aristotle was seldom used. Instead, it was the habit to assume
a general law. (6) Superstition and interference by the clergy discouraged
experimentation.

II.  Development  of  Agricultural  Science 


There was little activity in the sciences related to agriculture before 1800.
Fundamental discoveries at the close of the 18th century, together with the appearance of
several treatises on agriculture, started rapid development. Sir Humphrey Davy
(1813) published a book entitled "Elements of Agricultural Chemistry" in which he
brought together many known facts. A. von Thaer (1810) published a book on "Reasons
for Agriculture" in which he emphasized the value of humus in the soil, from which he
believed plants gained their carbon. In 1840, Justus von Liebig published his book on
organic chemistry in relation to agriculture, in which he advocated that the soil
need only be supplied with minerals. This latter work struck the scientific world as
a thunderbolt. It has had a great deal of influence on modern agricultural research.

The establishment of the Rothamsted Experimental Station in 1838 also reflected the
interest in agricultural science. The discoveries important to agriculture since
1850 have been: (1) the theory of evolution; (2) the discovery of anaerobic bacteria;
(3) the source of nitrogen in plants through the aid of bacteria; (4) Mendel's laws of
heredity; (5) the chromosome theory of heredity as a physical basis for inheritance;
and (6) the discovery of vitamins.

A -- Plant Nutrition

III.  Early  Plant  Discoveries 

Very little information was gathered on plant science from the time of the Greeks up
to the Renaissance. (1) Theophrastus: Published a book on plants entitled "Enquiry
into Plants." He classified plants into herbs, shrubs, and trees. Theophrastus also
distinguished bulbs, tubers, and rhizomes from true roots. Plant adaptation was
discussed. (2) Al Farbi: Discovered respiration in plants about 950 A.D. (3)
Johann van Helmont: This worker, who lived in the 17th century, believed that water
was transformed into plant material. He placed 200 lbs. of soil in a receptacle and
grew a willow in it. Nothing was added but water. At the end of five years, he
found the willow weighed 169 lbs. and 3 oz., while the original soil lost only 2 oz.
from its original weight. He concluded that the growth came from the water alone,
but failed to consider the air. (4) Jethro Tull: Believed that earth was the true
food of plants and that they absorbed soil particles. Therefore, he believed it
necessary to finely pulverize the soil through cultivation. Tull developed cultural
implements and devised a system to plant crops in rows.

IV. Source of Nitrogen in Plants

The period from 1840 to 1885 was taken up largely with the Rothamsted-Liebig
controversy on the source of nitrogen in plants.

(a)  Earlier  Work  on  Nitrogen 

The element nitrogen was discovered in 1772. Joseph Priestley, followed by
Jans Ingen-Hausz, settled the fundamental fact that green plants in sunlight
decompose the carbon dioxide from the atmosphere, set oxygen free, and retain the carbon.
This source of carbon accounts for the bulk of dry matter in plants. From his work
in 1804, Theodore De Saussure concluded that plants were unable to assimilate free
atmospheric nitrogen, but obtained it from the nitrogen compounds in the soil. The
pot experiments carried out by J. B. Boussingault, who began his investigations in
1834, indicated that plants draw their nitrogen entirely from the soil or manure.

(b) Liebig-Rothamsted Controversy

Justus von Liebig in 1840 maintained that green plants, by the aid of sunlight,
derive their total substance from carbonic acid, water, and ammonia present in
the atmosphere, and from simple inorganic salts in the soil which are afterwards
found in the ash when the plant is burned. Liebig believed combined nitrogen in the
soil to be unnecessary in plant nutrition. This view was disputed by Lawes and
Gilbert, who began elaborate experiments at Rothamsted in 1857. They grew plants under
glass shades, ammonia from the air being kept out. The earth, pots, manures, etc.,
employed in the experiment were burned to sterilize them. Carbon dioxide was
introduced as required. Lawes and Gilbert made their trials both without manure and with
ammonium sulfate. Their work was done so carefully that the possibility of nitrogen
fixation by plants was excluded. While they concluded that plants require combined
nitrogen from the soil, they were unable to account for the gain in nitrogen in some
plants under field conditions. They found actual gains in nitrogen when leguminous
plants were grown in the field, which was in agreement with the long experiences of
practical farmers.

(c)  Final  Experiments  on  Nitrogen  Relations 

The final experiment on nitrogen assimilation by plants was performed by
H. Hellriegel and H. Wilfarth, who found the symbiotic relationship between bacteria
and legumes. When he grew plants in sand, Hellriegel (1886) found that the Gramineae,
Cruciferae, Chenopodiaceae, etc., grew almost proportionally to the combined
nitrogen supplied. When it was absent, nitrogen starvation took place as soon as the
nitrogen from the seed was exhausted. In legumes, he found that the plants were able to
recover and begin luxuriant growth. The roots always had nodules on them in such
instances. However, legumes grown in sterile sand behaved the same as other plants,
but recovery could be brought about when a watery soil extract was added to them.
Renewed growth and assimilation of nitrogen were found to depend upon the production
of nodules on the roots. Wilfarth (1887) found bacteria in the nodules and settled
the point that bacteria are associated with nitrogen fixation. Later, these results
were confirmed at Rothamsted and final proof was obtained on the role of nitrogen in
plants. As Hall (1905) recounts, the "very vigor" of the Rothamsted laboratory
prevented fixation of nitrogen by the exclusion of all possibility of inoculation. The
legumes as a class were found to be an exception to the contention that plants could
use only combined nitrogen from the soil. Both schools were partly right.

B  --  Evolution  and  Genetics 

V.  Early  Work  in  Genetics 

Although  many  facts  of  inheritance  were  known  previously,  Genetics  has  been  regarded 
as  a  science  only  since  1900.  At  that  time,  the  work  of  Gregor  Mendel,  originally 
published in 1865, was brought to light. Early work is reviewed by Roberts (1919;
1936),  Zirkle  (1932,  1935),  and  by  Cook  (1937). 

(a) Sex in Plants

The bisexual nature of the date palm was recognized by the early Babylonians
and Assyrians 5000 years ago. The ancients ascribed many monstrosities to
hybridization. Many theories of heredity were in vogue, but no experimental data.
Theophrastus and Pliny discussed sex in plants. Primitive men made improvements in crops,
rice and maize being good examples.

About 1600 a new spirit of scientific skepticism began to be manifest. Many
of the cumulative absurdities and theories were being put to experiment. The
increased interest in biology culminated in the publication of the famous letter by
Camerarius in 1694 on sex in plants. He gave convincing evidence that plants are
sexual organisms. Sex in plants was demonstrated by actual experiments with spinach,
hemp, and maize.

(b)  Hybridization  of  Plants 

This work was followed by the production of the first artificial plant hybrid
by Thomas Fairchild in England, a short time before 1717. In the next 50 years there
occurred a veritable wave of hybridizing. Crosses between more than a dozen genera
were made by several investigators. This period culminated in the publication of the
work of J. G. Koelreuter (1761-66) in which he reported the results of 136 experiments
on artificial hybridization. In 1793, C. K. Sprengel observed cross pollination of
plants by insects. However, Zirkle (1932) calls attention to the fact that insect
pollination was observed by an American named Miller at a much earlier date. From
1760 to 1859 there followed many experiments on plant hybridization in attempts to
determine the nature of inheritance. In 1822, John Goss (England) reported but
failed to interpret dominance and recessiveness, and segregation in peas. A. Sageret
(France) in 1826 classified contrasting characters in pairs, using muskmelons and
cantaloupes. K. F. von Gaertner reported in 1835 on hybridizations made with 107
plant species. He noted plant vigor and the uniformity of the first generation after
a cross. In 1863, C. V. Naudin (France) published a memoir on hybridization in which
he almost discovered the laws of inheritance.

VI. The Theory of Evolution

The theory of organic evolution is one of the most profound theories propounded in
the past 300 or 400 years. It was brought to fruition in the publication of the
"Origin of Species" by Charles Darwin in 1859. Hybrids are discussed extensively,
but its contribution to genetics was mostly indirect. It marked the beginning of
the modern experimental approach to biological problems.

(a)  Evolution  before  Darwin 

When  Darwin  published  the  "Origin  of  Species"  spontaneous  generation  and 
special  creation  were  the  current  theories.  A  great  majority  of  naturalists  believed 
that  species  were  immutable  productions  specially  created.  Up  to  this  time,  empiri- 
cal rather  than  scientific  improvement  had  been  made  in  plants.  Darwin  did  not 
originate  the  evolution  theory;  he  merely  furnished  evidence  for  its  substantiation. 
Aristotle  had  expressed  the  central  idea  of  evolution.  Modern  philosophy  from  Fran- 
cis Bacon  onward  shows  definiteness  in  its  grasp  and  conception.  Erasmus  Darwin, 
grandfather of Charles Darwin, had a theory similar to that propounded in the "Origin
of Species." J. B. P. de Lamarck, in his "Philosophie Zoologique" published in 1809,
made the first attempt to produce a comprehensive theory of evolution. He added the
idea of "use and disuse." Lamarck believed in the inheritance of acquired characters
and attributed some influences to direct physical factors. In other words, all the
principal factors of evolution had been worked out before the time of Darwin with
the possible exception of "survival of the fittest" which he obtained from a book by
Malthus on population.

(b)  The  Work  of  Charles  Darwin 

Darwin made an extended trip around the world in the Beagle, collecting
voluminous facts and making extensive observations in support of his theory. He is
given credit for the evolution theory because he was the first to gather facts. He
attempted to show how and why new species arose. (1) Theory of Natural Selection:
Present organic forms are believed to have evolved from more simple forms in past
ages. The theory was founded on these facts: (a) Variations between individuals
are universally present; (b) a struggle for existence takes place between individuals;
(c) through natural selection those individuals with the most favorable variations
survive; and (d) heredity tends to perpetuate the favorable variations from natural
selection. (2) Reasons for Success: Darwin was successful because of his
thoroughness, accuracy, hard work, honesty, ability to see, and because he was a stickler for
details. He showed by example that disinterestedness, modesty, and absolute
fairness are important attributes of character in intellectual work. Darwin (1859)
himself states that his success was due to a love of science, unbounded patience for long
reflection on a subject, industry in the collection and observation of facts, as well
as a fair share of invention and common sense.

VII.  The  Cell  in  Relation  to  Inheritance 

Independent progress was being made in other fields that were to have a profound
influence on genetics after 1900. A. van Leeuwenhoek (Holland) discovered the
microscope and saw mammalian germ cells in 1677. The cell theory was propounded in
1838-39 by M. J. Schleiden and T. Schwann (Germany). This was the first generalized
statement that all organisms are made up of cells--one of the greatest
generalizations of experimental biology. The union of sperm and egg cells, i.e., fertilization,
was first seen in seaweed by G. Thuret (France) in 1849. A year later he showed that
the egg would not develop without fertilization. The chromosomes were described in
1875 by E. Strasburger (Germany). During the same year Oscar Hertwig (Germany)
proved that fertilization consists of the union of two parental nuclei contained in
the sperm and ovum. W. Flemming (Germany) in 1879-82 described the longitudinal
splitting of the chromosomes, and later observed (1884-85) that the halves of split
chromosomes went to opposite poles. Th. Boveri (Germany) in 1887-88 verified the
earlier prediction of A. Weismann that reduction in the chromosomes takes place. In
1898, S. G. Navashin (Russia) discovered double fertilization in higher plants.
Thus, the physical mechanism of inheritance was pretty well worked out by the time
that the work of Mendel was discovered.

VIII. The Laws of Inheritance

The turn of the century proved to be an epochal year in the experimental study of
heredity. The work of Gregor Mendel (Austria), an Augustinian monk, on inheritance
in peas was rediscovered in 1900 by Hugo De Vries, C.F.J.E. Correns, and E. von
Tschermak. The work had been published originally in 1866.

(a)  Discovery  of  Principles  of  Heredity 

Mendel made crosses of peas and observed carefully the resemblances and
differences among different races. He began his work in 1857. The principles of
heredity which he put forth were as follows: (1) single heredity units, (2)
allelomorphism or contrasted pairs, (3) dominance and recessiveness, (4) segregation, and
(5) combination. The last two are generally recognized as the distinct contributions
of Mendel.

(b)  Methods  used  by  Mendel 

There are several reasons for the success of Mendel. His work differed from
that of his predecessors in several respects. (1) He made actual counts and kept
records of each generation. (2) One pair of factors was studied at a time. (3) His
material was carefully studied and selected. (4) He guarded against errors in
accidental crosses. (5) He worked with large numbers. (6) The crosses were studied for
seven generations. Roberts (1929) comments as follows on the work of Mendel:
"Nothing in any wise approaching this masterpiece of investigation had ever appeared
in the field of hybridization. For far-reaching and searching analysis, for clear
thinking-out of the fundamental principles involved, and for deliberate, painstaking,
and accurate following-up of elaborate details, no single piece of investigation in
their field before his time will at all compare with it, especially when we consider
the absolute absence of precedent and initiative for the work."

IX. Modern Developments in Genetics

The universality of Mendelian principles was verified in plants, animals, and man
within three years. In 1902, Hugo De Vries advanced the mutation theory to explain
sudden changes in plants that breed true, but which could not be accounted for by
Mendelian inheritance. He found sudden changes in the evening primrose to breed
true in certain cases. These mutations were believed to furnish the basis for
evolution. This was soon followed by the pure-line concept, i.e., variations in the
progeny of a single plant of a self-fertilized species are not due to inheritance.
This was first put forth by W. L. Johannsen (Denmark) in 1905. H. Nilsson-Ehle
(Sweden) advanced the multiple-factor hypothesis in 1908. The chromosome theory of
heredity was announced by T. H. Morgan in 1910. His gene theory included the
principle of linkage of genes resident on the same chromosome. This brilliant
hypothesis has been upheld in many experiments. Much recent work has been concerned with
polyploidy, the mechanism of crossing-over, and sterility.

The  principles  of  genetics  have  enabled  plant  breeders  to  make  definite  contributions 
to  improved  varieties.  Many  new  varieties  are  now  grown  on  farms  that  have  been 
made  possible  through  application  of  the  laws  of  inheritance.  Many  varieties  are 
"made  to  order"  to  meet  particular  conditions. 

C -- Other Basic Sciences

X. Development of Bacteriology

Great advances were made in the field of bacteriology between 1860 and 1880, it being
definitely established that bacteria bring about putrefaction, decomposition,
and other changes. The work of Louis Pasteur dominated the field during this period.

(a) Pasteur and his Work

Pasteur discovered anaerobic bacteria. Fermentation was commonly thought to
be the result of a chemical change, but Pasteur proved it to be due to anaerobic
bacteria. This wrecked the theory of spontaneous generation of life. Pasteur showed
that the presence of bacteria could always be traced to the entrance of germs from
the outside, or to growth already present. Other contributions of Pasteur included
the discovery of the causes of many bacterial diseases, and the development of
methods of immunization. Pasteur had several attributes that led to his success:
(1) he established truth by experiment; (2) he was discerning with regard to the
problem on which he worked; and (3) he worked on one problem at a time.

(b) Other Discoveries in Bacteriology

Many further developments in bacteriology depended upon the improvement of
the microscope and the perfection of various technics. The oil immersion lens was
developed about 1860. The agar-plate method for the study of growing colonies of
bacteria was introduced by Robert Koch in 1881. The transformation of ammonia to
nitrates was demonstrated by T. Schloesing and A. Muntz in 1877, but it remained for
S. Winogradsky to isolate the organisms concerned. That nodules are formed on
legumes as the result of inoculation with microorganisms was demonstrated by H.
Hellriegel and H. Wilfarth in 1886. M. W. Beijerinck isolated non-symbiotic bacteria,
i.e., the Azotobacter, in 1901. Among other contributions of bacteriology were
sterilization technics, the classification of bacteria on a physiological basis
(started by Ferdinand Cohn in 1872), the study of diseases due to filterable viruses,
and studies in the nature of bacteriophagy.

XI.  Plant  Pathology 

Like  all  natural  sciences,  plant  pathology  had  its  start  with  the  dawn  of  civiliza- 
tion. The  Hebrews  mentioned  plant  diseases  in  the  Bible,  but  only  gave  descriptions 
and  mentioned  damage.  Little  was  known  about  plant  diseases  until  the  modern  era 
which began about 1850.

One of the greatest early workers was Anton de Bary (German) who proved the
parasitism of fungi in 1853. A little later (1864) he proved heteroecism in rusts as
illustrated by the relation of the aecidium on the barberry to the red and black
rust stages on wheat. That bacteria may cause plant diseases was first proved by
Thomas Burrill in 1879-81. He showed that a definite species, Bacillus amylovorus,
was the causal agent of fire blight. The use of Bordeaux mixture as a fungicide was
started in France in 1886. Since that time, many other fungicides have been used in
plant disease control, the latest being the organic mercury compounds.

Biologic strains in rusts were discovered in 1894 by J. Eriksson (Sweden), while
races within a variety of rust were demonstrated by E. C. Stakman and his coworkers
in 1916. Another important discovery was made by J. H. Craigie in 1927 when he
discovered sexuality in the rusts.

A rapid increase in the knowledge of so-called virus diseases of plants has taken
place since the first proof of tobacco mosaic as an infectious disease in 1888. The
role of insects in the transmission of the virus or active principle was soon
recognized. Recently, W. M. Stanley (1937) has advanced strong evidence that the tobacco
mosaic virus is due to a high molecular weight crystalline protein.

A  great  deal  of  attention  is  now  being  given  to  the  production  of  disease-resistant 
and immune varieties of crop plants through the application of genetic methods.

Questions for Discussion

1.  Who  is  considered  the  "Father  of  Science"?  Why? 

2. What were the contributions of Galileo, Bacon and Newton to science?

3. Why did science develop slowly previous to the 16th century?

4.  Name  several  discoveries  important  to  agriculture  since  1850. 

5.  Who  was  Theophrastus,  and  what  did  he  do? 

6. What were the views of these men on plant nutrition: Van Helmont, Jethro Tull,
Thaer,  Liebig? 

7. What did De Saussure contribute to agricultural research? Sir Humphrey Davy?
Boussingault? 

8.  What  facts  made  the  source  of  nitrogen  in  plants  so  important  a  problem  during 
the  19th  century? 

9.  Mention  three  theories  that  were  proposed  to  account  for  the  supposed  extraction 
of nitrogen by plants from the air.

10. Describe the experiments at Rothamsted conducted to determine whether or not
plants  secure  nitrogen  from  the  air.  What  was  the  result  of  these  experiments? 

11.  By  whom,  how,  and  when  was  the  source  of  nitrogen  of  legumes  discovered? 

12.  What  important  lessons  are  illustrated  by  the  investigations  relating  to  the 
source  of  nitrogen  in  plants? 

13. What did these men contribute to early plant science: Camerarius, Koelreuter,
Sprengel,  and  Naudin? 

14.  Who  originated  the  theory  of  evolution?  Why  was  it  not  accepted  at  that  time? 

15.  What  was  the  status  of  plant  and  animal  improvement  at  the  time  of  publication 
of  the  "Origin  of  Species"? 

16.  What  did  Darwin  contribute  to  the  theory  of  evolution?  Why  is  he  usually  given 
credit  for  it? 

17. What is the theory of natural selection?

18. Describe the work of Mendel and tell why he was successful.

19.  What  is  the  mutation  theory?  Chromosome  theory  of  heredity?  Multiple  factor 
hypothesis? 

20.  What  was  the  prevailing  belief  in  spontaneous  generation  of  life  when  Pasteur 
began  investigating  the  subject? 

21.  What  are  the  principal  contributions  of  Pasteur? 

22. Name several advances made in bacteriology since the time of Pasteur.

23.  Name  5  important  discoveries  in  plant  pathology. 


CHAPTER III
LOGIC IN EXPERIMENTATION


I.  Scope  of  Science 

Science is systematized knowledge. The function of science is the classification of
observations and the recognition of their sequence and relative significance. Its
scope is to ascertain truth in every branch of knowledge. Sound logic is just as
fundamental to good science as accurate data. The thought process most important in
science is induction, i.e., reasoning from the particular to the general.
Generalizations may lead to laws and principles about natural phenomena.

II. Science among the Ancients

"Primitive peoples lived through thousands of years of myth and magic, while science
was rising out of slow and unconscious observations of natural events," Weir (1936)
explains. Aristotle (384-322 B.C.), one of the first to stress science, taught that
it can be developed only through reason. He set up a logical scheme, called the
syllogism, which severely limited deductions made from generalizations. Jevons
(1870) describes the syllogism as follows: "In a syllogism we so unite in thought
two premises or propositions put forward, that we are enabled to draw from them or
infer, by means of the middle term they contain, a third proposition called the
conclusion." An example is as follows:

"All living plants absorb water;         (major premise)
A tree is a living plant;                (minor premise)
Therefore, a tree absorbs water."        (conclusion)

The syllogism has been rejected for a long time because it leads to no new knowledge.
It involves a deductive process, the conclusions being only as accurate as the
premises upon which they are based. New generalizations can only be reached through
induction, a process which affords a means to attack the premises themselves.
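
The deductive character of the syllogism can be made explicit in a formal sketch. The Lean fragment below is only an illustration and is not part of the original text; the names Thing, LivingPlant, Tree, and AbsorbsWater are hypothetical labels chosen for the example. It suggests why the conclusion adds nothing beyond the premises: the proof merely chains the two assumptions together.

```lean
-- Hypothetical names, introduced only for this illustration.
example (Thing : Type) (LivingPlant Tree AbsorbsWater : Thing → Prop)
    -- major premise: all living plants absorb water
    (major : ∀ x, LivingPlant x → AbsorbsWater x)
    -- minor premise: a tree is a living plant
    (minor : ∀ x, Tree x → LivingPlant x) :
    -- conclusion: a tree absorbs water
    ∀ x, Tree x → AbsorbsWater x :=
  fun x hTree => major x (minor x hTree)
```

If either premise were false, the same proof would still go through formally, which is the sense in which the conclusion is only as accurate as the premises.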


A  --  Methods  and  Types   of  Research 


III.  Research 


In the broad sense, the collection and analysis of data is research. However, there
are different degrees of research value. Black, et al. (1928) state that the mere
accumulation of facts, computation of averages, or census-taking is not research.
Fact-gathering alone is a mechanical procedure unless tied up with analysis.
Moreover, projects designed to serve purely local or temporary needs, without some
contribution to fundamental principles, ordinarily can have but little scientific value.
General laws or principles are sought in research of the highest order. There are
two methods of research, the empirical and the inductive. Black, et al. (1928)
state that "the essential difference between the two is that the first one accepts
superficial relationships without inquiry as to antecedents, whereas the second one
pursues antecedents a stage or two at least." An antecedent is a condition or
circumstance that exists before an event or phenomenon.

IV.  Inductive  or  Scientific  Method 

The  process  of  induction  is  of  special  importance  in  experimental  science  in  which 
general  laws  are  established  from  particular  phenomena.  The  inductive  method  is  the 
scientific  spirit  of  the  day. 

(a) Explanation of Induction

In induction one proceeds from less general, or even from individual facts,
to more general propositions, truths, or laws of nature. In other words, it is the
formulation of a principle from facts. Induction was the method of Francis Bacon,
who held that general laws could be established with complete certainty by almost
mechanical processes. Bacon advised that one begin by collecting facts, classifying
them according to their agreement and difference. It is then possible to induce from
their differences and similarities the possible reasons for the relationships
exhibited and, from them, arrive at laws of greater and greater generality. Thus, the
inductive method attempts to answer the question "why". A knowledge of causes
enables the scientist to forecast with greater and greater assurance because, when
he knows what is behind a set of relationships, he is in a much better position to
know whether or not they will occur again. On the other hand, deduction is the
inference from the general to the particular, i.e., some truth may allow individual
facts to be subsumed under it. Induction and deduction are used together in
experimental work. For instance, a premature induction may be made to account for a
phenomenon. A hypothesis is set up that may or may not be faulty. Next, an
experiment is designed, a purely deductive process, to test this hypothesis. The
investigator determines the particular instances he may create and observe by
experiment to use as a basis of a re-generalization to establish the original hypothesis.

(b) Observation

The first requisite of induction is experience to furnish the facts. Such
experience may be obtained by observation or experiment. Jevons (1870) makes this
statement: "To observe is merely to notice events and changes which are produced in
the ordinary course of nature, without being able, or at least attempting, to control
or vary these changes." The botanist usually employs mere observation when he
examines plants as they are met with in their natural condition. Progress of knowledge
by mere observation has been slow, uncertain, and irregular in comparison with that
attained in the controlled experiment. However, to observe well is an art that is
extremely advantageous in the pursuit of the natural sciences. One should make
accurate discrimination between what he really does observe and what he infers from
the facts observed. The investigator should be "uninfluenced by any prejudice or
theory in correctly recording the facts observed and allowing to them their proper
weight", according to Jevons (1870).

(c) Experimentation

In the experimental method in its pure form, a special hypothetical plan
becomes the basis of conclusions. The investigator varies at will the combinations of
things and circumstances, and then observes the result. Fisher (1937) describes
experimental observations as "only experience carefully planned, in advance, and
designed to form a secure basis of new knowledge; that is, they are systematically
related to the body of knowledge already acquired, and the results are deliberately
observed, and put on record accurately." In actual practice, the effect of
different factors is determined by holding all conditions constant or uniform except the
one or ones whose "effects" are to be measured, a definite amount of change in this
condition being balanced against a definite amount of change in the result. Black,
et al. (1928) state that it is sometimes only the effect of the presence or absence
of a condition that is noted. The method is qualitatively experimental instead of
quantitatively in such cases. In many cases, it is impossible to hold all conditions
but one constant or even uniform. Statistical analysis is therefore combined with the
experimental design to measure variation where it cannot be controlled. This is the
practice in many agronomy experiments. For instance, when two or more wheat
varieties are compared for yield, they are planted in the same field, at the same time,
and at the same rate. Moreover, they are harvested at the same time, threshed by
the same machine, and the seed weighed on the same balance. The conditions are thus
uniform for the varieties rather than constant. The importance of the experiment is
well summarized by Jevons: "It is obvious that experiment is the most potent and
direct mode of obtaining facts where it can be applied. We might have to wait years
or centuries to meet accidentally with facts which we can readily produce at any
moment in a laboratory . . . ."

(d)  Essentials  of  Good  Scientific  Method 

The essentials in sound experimental method may be briefly summarized as
follows:

1.  The formulation of a trial hypothesis.

2.  A careful and logical analysis of the problem generated by the
hypothesis.

3.  Use of the deductive method to design how to effect a solution
of the problem. This involves a detailed outline of the
experiment with costs, equipment, methods, etc. The factors should be
expressed in quantitative terms when possible.

4.  Control of the personal equation.

5.  Rigorous and exact experimental procedure with the collection of
data pertinent to the subject.

6.  Sound and logical reasoning as to how the conclusions bear on the
trial hypothesis and in the formulation of generalizations. A
statement of the exact conclusions warranted from the cases
examined should be made in accurate terms.

7.  A complete and careful report of data and methods of analysis so
that others can check them.

V. The Empirical Method

"When a law of nature is ascertained purely by induction from certain observations or
experiments, and has no other guarantee for its truth, it is said to be an empirical
law," according to Jevons (1870). Thus, knowledge is empirical when one merely knows
the nature of phenomena without being able to explain the facts. It only answers
the question "how". Formerly, the empirical method represented knowledge secured by
trial, but today it means the haphazard "cut and try" method. A person who learns
certain facts through repeated observations may know no reason for their being true,
i.e., he cannot bring them into harmony with any other scientific facts. The method
is valuable in spite of the criticisms against it. Empirical methods are most likely
to be used when a science is new. Facts must be gathered before a notion of reasons
can be formulated. The older crop rotation experiments were empirical.
Recommendations are based on the results, i.e., certain rotation systems result in higher crop
yields. Crop variety tests are generally empirical, since the chief concern is to
determine what variety yields the highest.* Fully one-half the agronomic experiments
in this country are haphazard in the nature of their relationship to the body of
known knowledge in a given line. Too often they are not related to past experiments.
(See Allen, 1930).

VI.  General  Types  of  Agronomic  Experiments 

Agronomic experiments can be divided into field and laboratory or greenhouse
experiments. Questionnaires and surveys are occasionally used to secure preliminary
information.


*Note: In recent years, variety tests may involve more than empiricism. Crosses are
often made to combine high yield with certain desirable quality factors or disease
resistance. The yield trial determines whether or not the result has been
accomplished.



(a)  The  Field  Experiment 

The field experiment involves the use of small plots, usually between 1/10
and 1/1000 acre in size. The treatments are replicated, i.e., repeated on the
experimental area in tests designed to remove the error due to soil heterogeneity.
To make other conditions as uniform as possible, the varieties or treatments in the
experiment are treated as nearly the same as possible except for the factor or
factors under study. The field experiment has a wide application where yield is used
as a criterion to measure treatment effect. Field experiments may be classified as
follows: (1) Variety Tests: Such trials usually measure the yield of strains,
varieties, and species. Various combinations of forage crops for hay or pasture are
sometimes classified as variety tests. (2) Rate and Date Tests: These experiments
are concerned with the yield response of a variety or crop when planted at different
rates or on different dates. (3) Crop Rotation Tests: These trials include
different series of rotations and crop sequences. (4) Cultural Studies: The time, manner,
and frequency of field operations are considered in such tests. (5) Fertilizer
Experiments: These experiments usually include tests to determine the needs of
nitrogen, phosphorus, and potassium and their best combinations. Other considerations
are ways to supplement farm manures, value of cover crops and green manures, and the
amounts and methods of lime application. (6) Pasture Experiments: Field
experiments with pastures are generally used to study methods to seed and fertilize new
pastures, methods to renovate old pastures, and the influence of grazing on species
survival. (See Noll, 1928).
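
The idea of replication can be illustrated with a short sketch. It assumes a
randomized complete block arrangement, in which every treatment appears once in
each block and the order within a block is drawn at random. The text at this point
does not prescribe a particular layout, so the arrangement, the function name
`randomized_block_layout`, the variety labels, and the seed are illustrative
assumptions only.

```python
import random


def randomized_block_layout(treatments, n_blocks, seed=0):
    """Return one plot order per block; each block holds every treatment once.

    This is a hypothetical helper for illustration, not a procedure taken
    from the text.
    """
    rng = random.Random(seed)  # fixed seed so the layout can be reproduced
    layout = []
    for _ in range(n_blocks):
        block = list(treatments)  # copy so the caller's list is untouched
        rng.shuffle(block)        # random plot order within this block
        layout.append(block)
    return layout


# Hypothetical variety labels and number of replications.
varieties = ["A", "B", "C", "D"]
layout = randomized_block_layout(varieties, n_blocks=3)
for number, block in enumerate(layout, start=1):
    print(f"Block {number}: {' '.join(block)}")
```

Because each variety occurs in every block, differences in soil between blocks
affect all varieties alike and can be separated from the variety comparison.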

(b)  Laboratory  and  Greenhouse  Experiments 

Laboratory and greenhouse tests are often used to supplement field trials.
These tests often involve potometer and lysimeter studies as well as those based on
special techniques. Pot cultures are sometimes necessary for the study of the effect
of one factor by the exclusion of the others, or by their exaggeration. However, the
sole use of laboratory experiments may result in erroneous conclusions when applied
to field conditions. (See Wheeler, 1907). The use of laboratories and greenhouses
is on the increase because they have the advantage of controlled conditions. Some
agronomic problems adapted to such conditions are: (1) artificial rust epidemics,
(2) toxic effect of sorghums on crops that follow, (3) fertilizer cultures, (4)
resistance of winter wheat to low temperatures, and (5) moisture, temperature, and
light relationship studies. Equipment for the study of hardiness in crop plants has
been described by Peltier (1931).

Potometers are pots filled with soil in which plants are grown for
experimental purposes. To a greater or less extent the earlier investigators assumed the
accuracy of such experiments when applied to field conditions. Lysimeters are
modified soil tanks used to measure the magnitude of nutrient losses from the soil by
leaching under various fertilizer and cropping conditions. Installation of
lysimeter equipment is expensive but permanent. The principal feature is the
measurement of drainage water. A description of lysimeter equipment is given by Lyon and
Bizzell (1918) and by the American Society of Agronomy (1933).

(c) Questionnaires and Surveys

Very little use is made of either the questionnaire or survey in agronomic
research. They are considered less desirable than the controlled experiment. The
questionnaire consists of a set of questions to be answered without the aid of an
investigator (usually mailed). It is impossible to secure accurate answers on
questions that are closely defined because the chances for misinterpretation are too
great. Survey data are collected with the personal aid of an enumerator or
investigator. Spillman (1917) assumes that careful analyses of the methods of a large
number of farmers under essentially similar soil, climatic, and economic conditions
may be made to reveal the success of one person and the failure of others. He found
that the discrepancy in the farmer's knowledge was small in large items, but increased
as the importance to him decreased. Black, et al. (1928) mention some of the
weaknesses of the survey: (1) It does not furnish enough detail for some types of
problems; (2) It is not accurate enough for close analysis; and (3) It does not furnish
a large enough sample for some purposes.

VII. Hypotheses, Theories, and Laws

The  difference  between  the  hypothesis,  the  theory,  and  the  law,  is  in  the  degree  of 
surety  or  the  absolute. 


(a) Explanation of these Terms

When an idea is suggested by observed phenomena it is spoken of as a
hypothesis. It represents a desire to explain the phenomena such as, for example, the
method by which plants take food from the soil. The hypothesis is important in the
deductive method in that, to test this preliminary induction, it is replaced more or
less completely by imagining the existence of agents which are thought adequate to
produce the known effects in question. Thus, Jevons (1870) explains, the truth of
a hypothesis altogether depends upon subsequent verification. A theory is a limited
and inadequate verification of a hypothesis. Examples are the theory of the gene,
and the theory of evolution. A theory becomes a law when it is proved to be a fact
beyond a reasonable doubt. The Mendelian laws of heredity are good examples of laws.

(b)  Formulation  of  a  Hypothesis 

There are certain advantages to the hypothesis: (1) It correlates facts;
(2) it forecasts other facts; and (3) it allows for discrimination between valuable
and useless information. Every experiment is the result of a tentative hypothesis
thought out in advance of the actual test. The hypothesis is based on the
recognition of coincident phenomena, or upon a familiarity with possible causes and effects.
Hibben  (1908)  states:   "Hypothesis  and  experiment  to  Charles  Darwin  were  like  a  two- 
edged  sword  which  he  employed  with  rare  skill  and  effect."  The  hypothesis  is  the 
precursor  of  the  experiment  which  is  merely  an  effort  to  solve  the  problem  created 
by  the  hypothesis. 

(c) Qualities of a Good Hypothesis

There are several qualities that a good hypothesis should possess. These are
as follows: (1) It should be plausible. (2) It must be capable of proof, i.e., it
should provide a susceptible means to attack the problem created thereby. (3) It
must be adequate to explain the phenomena to which it is applied. (4) It should
involve no contradiction. (5) A simple hypothesis is preferable to a complex one.
There is little use to form a hypothesis on a complex basis unless it is possible to
collect the data by which it may be proved. A multiple hypothesis is made up of
several ideas. Occasionally it may be desirable to formulate several hypotheses.
Salmon (1928) advises an investigator to at least give consideration to all
observable hypotheses. They are useful even though wrong because they eliminate that
particular idea from the problem. At any time, an investigator must be ready to
abandon a hypothesis or theory when further data prove the previous views untenable.


(d) Null Hypothesis

In all experimentation the null hypothesis is characteristic. The term has
been applied by R. A. Fisher (1937) in his "Design of Experiments." The basic
assumption is that no difference exists between the treatments in the experiment,
i.e., they are samples drawn from the same general population. For instance, in a
variety test, the investigator makes the basic assumption that all varieties yield
alike. He can never prove this assumption but he may disprove it in the course of
experimentation. By the use of certain statistical arguments he may show a
significant discrepancy from the hypothesis, i.e., the probability is that some of the
varieties do differ in yield. Fisher (1937) states: "Every experiment may be said
to exist only in order to give the facts a chance of disproving the null hypothesis."
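
The logic of the null hypothesis can be sketched in a few lines of Python. The
example uses an exact permutation (randomization) test for two varieties, chosen
here only because it is easy to follow; it is not presented as the procedure
developed in this manual. The plot yields are invented for illustration, and the
function name `permutation_p_value` is hypothetical.

```python
from itertools import combinations


def permutation_p_value(yields_a, yields_b):
    """Exact two-sided permutation test on the difference of mean yields.

    Under the null hypothesis both sets of plots come from one population,
    so every split of the pooled yields into two groups is equally likely.
    """
    pooled = list(yields_a) + list(yields_b)
    n_a, n = len(yields_a), len(pooled)
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        # Absolute difference between the two group means for a given split.
        return abs(sum_a / n_a - (total - sum_a) / (n - n_a))

    observed = abs_mean_diff(sum(yields_a))
    as_extreme = 0
    splits = 0
    for idx in combinations(range(n), n_a):
        splits += 1
        stat = abs_mean_diff(sum(pooled[i] for i in idx))
        if stat >= observed - 1e-12:  # small tolerance for rounding error
            as_extreme += 1
    return as_extreme / splits


# Hypothetical plot yields (bushels per acre) for two varieties.
variety_a = [31.2, 29.8, 33.1, 30.5]
variety_b = [35.9, 36.4, 34.8, 37.0]
p = permutation_p_value(variety_a, variety_b)
print(f"Proportion of splits at least as extreme as observed: {p:.4f}")
```

A small proportion indicates a significant discrepancy from the null hypothesis.
A large one does not prove that the varieties yield alike, in keeping with the
remark that the assumption may be disproved but never proved.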

(e) Crucial Tests

There may be two alternative conceptions or explanations which appear
possible. A crucial test (experimentum crucis) is one by which two rival hypotheses can
be tested so that if one is proved, the other is immediately disproved. This is the
only means by which a hypothesis may be disproved. The first record of the
application of the crucial test is attributed to Francis Bacon. A good example of a crucial
test was the one applied by Richey and Sprague (1931) to test two theories for the
cause of hybrid vigor in corn which is expressed when two inbred lines of reduced
vigor are crossed to give the first generation hybrid. This additional vigor, Richey
(1927) explains, has been attributed to the physiologic stimulation hypothesis in
which heterogeneous germplasm within the cells provides the stimulation. The other
hypothesis is that of dominant growth factors in which it is believed that the
maximum number of dominant growth factors are brought together in the first generation
hybrid, and that linkages of favorable dominant growth factors with other less
desirable factors prevented the recovery of individuals as vigorous as the F1 in
subsequent generations. Richey and Sprague (1931) applied a crucial test to the two
hypotheses by the collection of data on the principle of convergent improvement, i.e.,
backcrossing the F1 hybrid to each of the two inbred lines that went into the hybrid.
It was hoped to transfer some of the favorable dominant growth factors from one of
the lines and intensify them in the other. Thus, the two convergently improved lines
would have fewer differences between them than was true of the original inbred lines.
Lowered yields of the cross of the convergently improved lines, as compared to the
cross of the original lines, would tend to support the physiological stimulation
hypothesis. The same or higher yields from the cross of the convergently improved
lines would lend support to the dominant growth factor hypothesis. The data
collected gave support to the latter.

B -- Kinds of Evidence

VIII. Importance of Evidence

It is necessary to collect facts or data before generalizations can be made. There
are different kinds of evidence, some kinds being more apt to lead to valid
conclusions than others. However, plants are complex organic compounds with the result
that it is more difficult to determine the elements of cause and effect than is
ordinarily true in the more stable physical sciences. Environment has a tremendous
influence on the plant. The more that experiments or observations are repeated with
the same results, the more valid the evidence becomes in the minds of all normal
human beings. For example, a large number of experiments show that weed control is
the principal benefit derived from cultivation. The fact that a large number of
investigators have found this to be true under different conditions adds to the
assurance that the results are correct. Certain methods have been developed to deal
with the evidence obtained by observation or experiment which may serve as guides to
those in search of general laws of nature.

IX. Cause and Effect

Induction consists of inferring general conclusions from particular evidence. In
some cases, generalizations relate to cause and effect. An antecedent is a condition
which exists before the event or phenomenon, while a consequent follows after the
antecedents are put together. Jevons (1870) makes this statement: "By the cause of
an event we mean the circumstances which must have preceded in order that the event
should happen. Nor is it generally possible to say that an event has one single
cause and no more. There are usually many different things, conditions or
circumstances necessary to the production of an effect, and all of them must be considered
causes or necessary parts of the cause." It is certainly true that a multiplicity
of causes is often involved in experiments in field crops and soils.

X. Qualitative Evidence

Qualitative evidence is that which can be measured only categorically. For example,
seeds either germinate or fail to germinate. Classification by color is a common
form of qualitative data.

(a) Method of Agreement

This method of induction is defined by Jevons (1870) as follows: "The sole
invariable antecedent of a phenomenon is probably its cause." It is necessary to
collect as many instances as possible and compare together their antecedents. The
one or more antecedents which are always present when the effect follows are
considered the cause. For example, when rust is present on wheat, low yields are obtained.
Therefore, rust causes low yields. This method has a serious difficulty in that the
same effect in different cases may be due to different causes.

(b)  Method  of  Difference 

In this method, the antecedent which is always present when the phenomenon
follows, and absent when it is absent, is the cause of the phenomenon when other
conditions are held constant. In other words, when the circumstances are all in
common except one, i.e., the treatment, then the change that occurs is the effect of
the treatment. This is probably the most widely used method in experimentation. The
difference in crop yields under certain manurial treatments is an example of this
method.

(c) Joint Method

In the words of Jevons (1870), the joint method of agreement and difference
"consists in a double application of the method of agreement, first to a number of
instances where an effect is produced, and secondly, to a number of quite different
instances where the effect is not produced." For example, the experiments of Darwin
on cross and self-fertilised plants may be cited. He placed a net around 100 heads
to protect them from chance insect pollination. He also placed 100 heads of the
same variety where they were exposed to bees. The protected flowers failed to yield
a single seed, while the unprotected ones produced 2,720 seeds. Thus, cross
fertilization by means of insect pollination was proved to be a cause of seed set in this
case.
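
The methods of agreement and of joint agreement and difference can be made
concrete with a small sketch. Each instance is recorded as a set of antecedents
together with a note of whether the effect occurred. The records below are
invented for illustration (they are not Darwin's data), and the function names
`method_of_agreement` and `joint_method` are hypothetical.

```python
def method_of_agreement(instances):
    """Antecedents present in every instance where the effect occurred."""
    positives = [set(antecedents) for antecedents, effect in instances if effect]
    if not positives:
        return set()
    return set.intersection(*positives)


def joint_method(instances):
    """Antecedents always present with the effect and always absent without it."""
    candidates = method_of_agreement(instances)
    negatives = [set(antecedents) for antecedents, effect in instances if not effect]
    return {c for c in candidates if all(c not in neg for neg in negatives)}


# Hypothetical records: (antecedents observed, did seed set occur?)
records = [
    ({"bees", "water", "full sun"}, True),
    ({"bees", "water", "partial shade"}, True),
    ({"net", "water", "full sun"}, False),
    ({"net", "water", "partial shade"}, False),
]

print(sorted(method_of_agreement(records)))  # ['bees', 'water']
print(sorted(joint_method(records)))         # ['bees']
```

Agreement alone leaves two candidate causes, since water was present in every
instance. The instances in which the effect failed remove water from
consideration, which is the added strength of the joint method.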

XI. Quantitative Evidence

Every science, and every question, is first a matter of generalizations built upon
qualitative evidence. The effort to more firmly substantiate such generalizations
leads to the measuring of evidence quantitatively so that, by degrees, the evidence
becomes more and more precisely quantitative.

(a) Method of Concomitant Variations

This method can be applied where the phenomena can be measured. Every degree
and quantity of the phenomenon adds new evidence in support of relationships that
exist between antecedents and consequents of the phenomenon. The method which
employs concomitant variations to determine the degree of such relationship is called
correlation. For instance, an experiment with wheat results in a low yield under
conditions of heavy stem rust infestation, with variations to the other extreme.
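
The degree of relationship mentioned above is commonly expressed as a correlation
coefficient. The sketch below computes the ordinary product-moment (Pearson)
coefficient from paired measurements. The rust percentages and yields are
invented solely to illustrate the arithmetic, and the function name `pearson_r`
is an assumption made for this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical paired observations on six plots.
rust_percent = [5, 20, 35, 50, 65, 80]            # stem rust infestation
yield_bu = [38.0, 34.5, 30.1, 24.8, 20.2, 15.9]   # yield, bushels per acre

r = pearson_r(rust_percent, yield_bu)
print(f"r = {r:.3f}")
```

A value near -1 means that yield falls steadily as rust increases, a value near
+1 that the two rise together, and a value near 0 that little linear relationship
is shown.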
(b) Method of Residues

There may be several causes, each of which produces part of an effect, and
where it may be desirable to know how much of the effect is due to each. This type
of evidence consists in the analysis of a given phenomenon to determine the residue.
For instance, manure contains something besides phosphorus, potash, and nitrogen as
shown by the residues. In plants it has been determined that other than the
so-called 10 essential elements are used because analyses of the plant ash show others
to be present. The method of residues is constantly employed in chemical
determinations.

XII.  Relation  to  the  Original  Hypothesis 

Some experiments fail in their objective in that there is insufficient evidence at
hand to permit the investigator to draw positive conclusions. However, this
evidence is valuable. It has been called "negative evidence," but in reality there is
no such thing. Research would be much further along than it is today if all
experiments had been reported in which the evidence was insufficient to prove the
hypothesis that was originally set up by the investigator. Such evidence would have
saved other workers from a repetition of the work.

XIII.  Use  of  Analogy 

Analogy is a form of inference in which it is reasoned that, if two (or more) things
agree with one another in one or more respects, they will probably agree in still
other respects. It is the simplest and most primitive form of evidence, its great
weakness being the fact that the cases compared may not be parallel. Analogy may be
tested by some inductive method. For example, the theory of evolution was suggested
to Darwin from the "Essay on Population" by Malthus. It suggested to him that the
struggle for existence is the inevitable result of the rapid increase in organic
beings. The idea necessitated natural selection or "survival of the fittest."
Another example might be cited in durum wheat. Durum wheat is adapted to Russia and
so is Turkey wheat. Since Turkey wheat is adapted to the Great Plains in this
country, durum wheat must be adapted to this region also. A common analogy made by
agriculturists is that crops can be improved by systematic selection because
livestock breeders have succeeded in that way. Logic derived from analogy too often
leads the inexperienced astray.

C  --  Methods  of  Discovery 

XIV.  Work  of  other  Investigators 

An investigator seldom takes up work today that is entirely new. He secures valuable
help from other research workers. The cooperative attitude among the workers on the
Purnell corn projects is particularly commendable in this respect. They get together
occasionally to talk over their problems freely and to offer suggestions. They have
been unusually free with their preliminary data and unpublished results so far as
fellow workers are concerned. This attitude has done much to advance research in
corn improvement. The seed analysts have cooperated among themselves in a similar
manner. Scientific meetings result in a more or less free exchange of ideas to the
benefit of all. These get-togethers are a great aid and should be attended by
research workers.

XV.  Surprises  and  Accidental  Discoveries 

An important discovery is quite often made by accident. Several examples could be
cited.

(a) Lemon Juice in Grasshopper Bait

Some 25 years ago, two workers in the U. S. Department of Agriculture were
testing poison bran mash as a grasshopper bait in Kansas. These men had oranges in
the lunch they took to the field with them. While eating their oranges, some of the
juice accidentally came in contact with the bran mash. The men noticed that the
grasshoppers preferred the mash that contained the orange juice. As a result of
this discovery, Kansas came out with the lemon juice formula in 1911.

(b) Heterothallism in Stem Rust

Prior to 1927, it was believed that the pycnia on the upper surface of the
barberry leaf had no function. Craigie (1927) got the idea that the mycelium,
pycnia, and pycniospores of some of the pustules were plus sex strains and others
minus sex strains. He happened upon the proof by chance. The first fly of the
season appeared in the greenhouse on May 17. He watched it idly as it sipped nectar
at one pustule and then at another. Professor Buller happened by and said at once:
"The solution of the problem is an entomological one. Copy the fly. Take the plus
pycniospores to the minus pycnia, and the minus pycniospores to the plus pycnia."
Craigie followed this advice by mixing nectar from different pustules. The
pycniospores germinated and brought on the development of aecia and aeciospores, the
diploid phase. He repeated his test many times and found it to be true. Craigie
proved his theory as follows: Flies were introduced in some cages containing
barberry plants with pustules on the leaves, while flies were excluded from other
barberry plants. Aecia were formed in five days where the flies were present, but none
were formed where the flies were excluded.

XVI.  Systematic  Research 

One  of  the  principal  methods  of  discovery  is  through  systematic  research  where  a 
problem  is  attacked  from  all  conceivable  angles.  An  example  is  the  contribution  of 
the  Hawaiian  Experiment  Station  on  chlorosis.  The  pineapple  industry  was  restricted 
to  a  small  area  because  of  a  discoloration  of  the  foliage  that  showed  it  to  lack 
chlorophyll.  The  investigators  on  this  problem  first  exhausted  the  possibilities  of 
disease, after which they analyzed the soil and found it to contain considerable
manganese. Next, the workers used this high-manganese soil on soil that would grow
pineapples, and found that very little iron was taken into the plants. The manganese
was thus found to inhibit iron absorption. The plants were then sprayed with iron
salts and the chlorophyll deficiency corrected. Pineapple plants are now sprayed at
the rate of 50 pounds of iron salts per acre, the yield of fruit being doubled as a
result.

XVII. Other Methods of Discovery

Several other methods have resulted in significant discoveries. (1) Conflicting
Results: Disagreement between different research workers in their results often
leads to new discoveries. Pasteur became engaged in a controversy with Liebig on
the spontaneous generation of life. As a result, Pasteur proved that all new life
arose from forms that had already existed. Some of the most fertile fields for new
ideas are the first new hypotheses, theories, and ideas. (2) Accurate Work: Accurate
work is necessary to secure dependable facts on which to base conclusions. More
information usually results from work done carefully than from that which has been
unplanned and carried out in a haphazard manner. In addition, the work of
investigators must be accurate to withstand the close scrutiny of other workers and of
general opinion. Accurate work often has led to new discoveries. (3) Analogy: A
fruitful source of new ideas that sometimes leads to new discoveries is analogy. It
may suggest a hypothesis from the results secured in other experiments. (4) Ideas
from Farmers: In agricultural research, the problems called to the attention of
experiment station workers by farmers are an important source of discovery.

Questions for Discussion

1.  What  is  science? 

2.  What  is  the  syllogism?  Give  an  example. 

3. Why has the syllogism been abandoned in experimental work?

4. What is research? Discuss different values of research.

5. How does the inductive method of science differ from the empirical?

6. Why is it considered desirable to determine basic or fundamental laws rather
than merely to determine what happens?

7.  Distinguish  between  induction  and  deduction. 

8. What part does observation play in research work? What precautions are
necessary in its use?

9. What is an experiment? Discuss its use.

10.  What  are  the  principal  steps  in  the  inductive  method  of  science?  Which  ones 
are  most  often  omitted? 

11.  Under  what  conditions  is  the  empirical  method  justified? 

12.  Name  some  types  of  agronomic  tests  that  are  empirical  in  nature. 

13.  What  serious  limitation  is  true  of  the  empirical  method? 

14. What are some reasons for criticism of the methods of research?

15. Classify field experiments and describe each class.

16.  What  place  have  laboratory  and  greenhouse  tests  in  agronomic  research? 

17. How do questionnaires and surveys differ?

18.  Distinguish  between  potometer  and  lysimeter  tests. 

19.  Distinguish  between  hypothesis,  theory,  and  law. 

20.  Is  it  desirable  to  formulate  hypotheses  in  experimental  work?  Why? 

21.  What  qualities  are  necessary  in  a  good  hypothesis? 

22.  What  is  a  working  hypothesis? 

23. What advantages are there, if any, in formulating multiple hypotheses?

24. What is the null hypothesis?

25.  What  is  a  crucial  test?  Explain  one. 

26.  Why  is  research  often  more  difficult  in  plant  sciences  than  that  in  the  physical 
sciences? 

27.  Distinguish  between  cause  and  effect. 

28.  Name,  define,  and  illustrate  five  different  kinds  of  evidence. 

29.  What  is  the  most  important  inductive  method  in  experimentation?  Why? 

30.  What  is  analogy?  Discuss  its  use  and  give  an  example. 

31.  What  is  the  value  of  negative  evidence? 

32. Mention 4 ways in which discoveries are made.

33. How was the cause of chlorosis found in pineapples in Hawaii?

34. Mention 3 discoveries and tell how they originated.


CHAPTER  IV 
ERRORS  IN  EXPERIMENTAL  WORK 

I. Types of Experimental Error

Two  kinds  of  error  are  common  in  experimental  work,  systematic  errors  and  chance 
errors.  The  investigator  needs  to  be  familiar  with  both  kinds.  Such  errors  should 
be  distinguished  from  mistakes  and  blunders.  For  example,  a  worker  makes  a  mistake 
when  he  puts  down  a  weight  of  10  lbs.  when  the  scale  actually  showed  the  weight  to 
be  20  lbs. 

(a)  Systematic  Errors 

Systematic errors occur every time that an experiment is repeated in the
same way. Most experimental plans involve some errors of this kind. For example,
suppose  that  a  large  number  of  winter  wheat  varieties  are  arranged  systematically  in 
single-row  plots.  Some  of  the  varieties  kill  out  because  of  lack  of  hardiness.  The 
varieties  in  the  adjacent  rows  might  yield  abnormally  high  because  of  the  additional 
space  from  which  they  could  draw  moisture.  Such  an  error  would  be  repeated  every 
time the experiment is conducted in this manner. In this particular case, the
competition effect could have been avoided by planting three-row plots for each variety
and harvesting only the center row for yield.

(b)  Chance  Errors 

Errors  which  occur  by  pure  chance  with  no  definite  assigned  cause  are  known 
as  chance  errors.  They  are  generally  small  fluctuations  due  to  minor  causes.   Chance 
errors  may  accumulate  to  produce  a  sizeable  deviation  even  though  it  be  impossible 
to  foresee  and  analyze  all  causes  that  contribute  to  them.  The  principal  reason  for 
statistical  analysis  in  agronomic  science  is  its  very  inexactness  and  the  inability 
to  control  chance  errors.   In  case  the  present  theory  of  plot  technique  is  acceptable, 
the  variations  in  plot  yields  are  due  to  chance  errors  and,  in  most  cases,  have  been 
found  by  experience  to  be  normally  distributed.  This  means  that  there  are  a  large 
number  of  small  errors  and  a  small  number  of  large  errors.  Statistical  methods  are 
employed in field experiments to measure the effect of chance errors. In addition,
some  systematic  errors  can  be  removed  by  these  methods  as  will  be  shown  later. 
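
The distinction drawn above can be sketched numerically. The short Python example below is only an illustration under stated assumptions: the function names, the standard deviation, and the plot readings are hypothetical and do not come from any experiment cited in this chapter. The first function gives the share of normally distributed chance errors in three size classes. The second shows that balanced chance errors cancel in a mean, while a systematic error shifts the mean by its full amount.

```python
from statistics import NormalDist


def share_of_errors(sd=1.0):
    """Expected share of normally distributed chance errors in three size classes.

    `sd` is the standard deviation of the chance errors (hypothetical value).
    """
    errors = NormalDist(mu=0.0, sigma=sd)
    within_1 = errors.cdf(sd) - errors.cdf(-sd)
    within_2 = errors.cdf(2 * sd) - errors.cdf(-2 * sd)
    return {
        "small": within_1,              # smaller than 1 standard deviation
        "medium": within_2 - within_1,  # between 1 and 2 standard deviations
        "large": 1.0 - within_2,        # larger than 2 standard deviations
    }


def observed_mean(true_value, chance_errors, systematic_error=0.0):
    """Mean of readings, each equal to true value + systematic error + chance error."""
    readings = [true_value + systematic_error + e for e in chance_errors]
    return sum(readings) / len(readings)


if __name__ == "__main__":
    print(share_of_errors())
    # Hypothetical chance errors that happen to balance out exactly.
    chance = [-2.0, -1.0, 0.0, 1.0, 2.0]
    print(observed_mean(30.0, chance))                        # chance errors only
    print(observed_mean(30.0, chance, systematic_error=1.5))  # with a systematic error
```

Replication reduces the influence of chance errors on a mean, but it does nothing to a systematic error, which has to be met by the design itself.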

II.  Sources  of  Error  in  Experimental  Work 

Evidence gained by experiment is disputed, according to Fisher (1937), either on the
grounds that the interpretation is faulty, or on the criticism that the experiment
itself is poorly designed. Errors are always possible and seldom absent in
experimentation.

(a) Faulty Design and Inferior Technique

Experimental designs are inadequate or faulty when they do not afford a
proper opportunity for statistical analysis to separate and measure experimental
errors, both chance and systematic. Fisher states: "If the design of an experiment
is faulty, any method of interpretation which makes it out to be decisive must be
faulty too." The investigator may fail to take certain variable factors into account.
Aside from these, various personal errors may have been introduced, such as
carelessness. Farrell (1913) lists a few sources of error in field experiments. Among the
controllable ones are: incorrect weights of crop products, faulty determinations of
plot area, variations in quantities of products recovered and wasted, unobserved
variations in field treatments, etc. Among the errors seldom controlled, he cites:
plant variation, soil irregularities, uneven distribution of soil moisture, and
temperature variations. Frequently, the total effect from all causes is great enough to
influence the conclusions of the experiment. It might be added that some of these
errors can be measured and their influence on the conclusions removed.

(b)  Improper  Interpretation  of  Results 

Two common types of misinterpretation of experimental results are drawing
conclusions from too few data, and carrying the interpretation beyond the points
actually tested. (1) Conclusions drawn from too few data: An experiment may be
inadequately replicated in time and space to justify the conclusions drawn. Carleton
(1909) warns that some experiments are defective because they are run for an
insufficient length of time. Sometimes investigators are in too much of a hurry to
obtain results. Another common mistake is to over-emphasize small differences.
Statistical methods have done a great deal towards reducing invalid inferences due to
too few data. (2) Interpretation carried beyond points tested: Sometimes the
interpretation of the results of an experiment is carried beyond the points actually
tested. Salmon (1923) believes that one of the chief sources of error in agronomic
literature is the tendency to generalize from experiments limited in their scope.
For instance, it should be quite obvious that laboratory tests may not always be
applied to field conditions. Such generalization must be justified by a similarity
of conditions. As an example, suppose phosphates were added to the soil in a
fertilizer test in amounts of 100, 400, and 600 pounds per acre. One would be unable
to draw conclusions on, say, 1000 pounds, because it is beyond the amount tested in
the experiment. It is obvious that a point may be reached where the addition may
have a depressive effect. Sievers (1925) points out that recommendations based on
variety tests conducted under different conditions as to soil, climate, and weather
than those under which the farmer operates are unsatisfactory.
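
The warning against carrying an interpretation beyond the points tested can be built into a calculation. The Python sketch below is a minimal illustration, not a recommended analysis: the function name and the yields are hypothetical, the rates of 100, 400, and 600 pounds follow the phosphate example above, and straight-line interpolation between tested rates is itself an assumption. The function estimates a yield only inside the tested range and refuses any rate outside it.

```python
def estimate_within_tested_range(rates, yields, rate):
    """Estimate yield at `rate` by straight-line interpolation between tested rates.

    Raises ValueError for any rate outside the range actually tested,
    since the experiment offers no evidence there.
    """
    pairs = sorted(zip(rates, yields))
    lowest, highest = pairs[0][0], pairs[-1][0]
    if not lowest <= rate <= highest:
        raise ValueError(
            f"{rate} lb per acre is outside the tested range {lowest} to {highest}"
        )
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        if x0 <= rate <= x1:
            return y0 + (y1 - y0) * (rate - x0) / (x1 - x0)


if __name__ == "__main__":
    tested_rates = [100, 400, 600]       # pounds of phosphate per acre
    plot_yields = [20.0, 26.0, 28.0]     # hypothetical yields, bushels per acre
    print(estimate_within_tested_range(tested_rates, plot_yields, 500))
    try:
        estimate_within_tested_range(tested_rates, plot_yields, 1000)
    except ValueError as err:
        print("No conclusion possible:", err)
```

The refusal at 1000 pounds is the point of the sketch: the response may level off or turn downward beyond 600 pounds, and nothing in the data can say which.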

III.  The  Personal  Factor 

Individuals  differ  greatly  in  the  way  they  attack  problems  and  carry  out  the  various 
details  connected  with  them.  For  example,  two  men  will  seldom  agree  exactly  when 
they make measurements on the same thing because they do not "see exactly alike."
Such differences are apt to be more pronounced when personal judgment plays an
important role. The mixing of materials illustrates a situation where individual
workers may differ in the details of their procedures to an extent that the end-product
is affected. Mechanical devices tend to do away with the personal factor.

When several individuals work on an experiment it is desirable for the same person to
complete an entire operation, or at least to handle all the treatments in a single
replicate. For example, in a variety test the same person should plant the plots, harvest
them, and make the weights so far as possible. At least, the same crew should carry
out the details uniformly for all plots or treatments, preferably for the entire
test.

IV.  Sources  of  Variation  in  Field  Experiments 

Certain limitations in plot work must be recognized. To quote Noll (1928): "The
most serious are that the experiments must be made under constantly changing
conditions as to moisture and temperature, and that the average results for a given soil
in a given locality, no matter how carefully planned, are not necessarily applicable
elsewhere." The most common variations in field experiments are those due to plants,
those due to differences in seasons, and those due to the soil. The variations that
cannot be balanced out can be measured in a well-designed experiment. Some of those
due to defined causes, such as soil heterogeneity, can be removed or balanced in part
but not entirely. The variations that are due to unrecognized causes are measured
and assigned to experimental error.

(a) Errors Related to the Plant

Variation may be introduced due to differences in acclimatization unless this
factor happens to be the one under study. Differences in stand may be a fruitful
source of variation, particularly in crops like corn, sorghums, etc., where plant
individuality is important. There are fewer corn plants on a unit area than wheat
plants. A further source of variation due to plants is the difference in moisture
content of the harvested crop. Correction to a uniform moisture basis is advocated
under such conditions. Plant competition may introduce still further error in plot
results.
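
Correction to a uniform moisture basis can be sketched as follows. This is a minimal example under stated assumptions: the function name is hypothetical, moisture is taken on a wet-weight basis, and the standard of 15.5 per cent is only a commonly used figure for shelled corn, not one given in this chapter. The dry matter in the harvested weight is held constant and re-expressed at the standard moisture content.

```python
def to_uniform_moisture(harvest_weight, moisture_pct, standard_pct=15.5):
    """Convert a harvested weight to its equivalent at a standard moisture content.

    Moisture percentages are on a wet-weight basis. The default standard of
    15.5 per cent is an assumption for illustration only.
    """
    if not (0.0 <= moisture_pct < 100.0 and 0.0 <= standard_pct < 100.0):
        raise ValueError("moisture percentages must lie between 0 and 100")
    dry_matter = harvest_weight * (100.0 - moisture_pct) / 100.0
    return dry_matter * 100.0 / (100.0 - standard_pct)


if __name__ == "__main__":
    # Two hypothetical plots with the same field weight but different moisture.
    print(round(to_uniform_moisture(100.0, 20.0), 2))
    print(round(to_uniform_moisture(100.0, 14.0), 2))
```

Two plots with equal field weights can therefore differ in corrected yield, which is the variation the correction is meant to remove from the comparison.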

(b) Variations in Seasons

Climate rather than soil may be the limiting factor in crop production.
Some varieties are known to withstand extreme conditions like drouth or excessive
moisture better than others. From uniformity trials with corn over a 3-year period,
Smith (1909) concluded that more variation in yield could be expected in seasons
unfavorable for the crop. For that reason, tests conducted for only one or two years
may be very misleading. This situation may be remedied by the extension of a variety
test over a number of seasons to determine the variety that thrives best in an
average season. For a reliable average of seasonal conditions, a variety test should be
conducted for at least three years and preferably more. Under dryland conditions,
it takes at least 10 years to secure a reliable variety average. Variety comparisons
should be strictly comparable, i.e., compared only for the same years under test.
Usually this is accomplished by expressing yields in percent of the standard or
check. Other factors that may cause the yields of varieties to vary from season to
season are: (1) The plots may be damaged by windstorms one year and not in another.
(2) Rodents may cause more damage in some years than in others. (3) Insects may be
troublesome in certain seasons. (4) Rust in small grains may reduce yields more in
some years than in others. (5) There may be an inaccuracy in scale weights from one
season to another. (6) Carelessness in harvesting or threshing is another factor in
some seasons. (7) Sometimes the planter fails to drill out to the end of a plot with
a possible error in yield as a result. (8) Crooked rows may introduce errors in the
yields of row crops.
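
The rule that varieties be compared only for the same years, with yields expressed in percent of the check, can be sketched in a few lines. The Python example below is an illustration only: the function name, years, and yields are hypothetical. It also assumes the percentage is taken as the ratio of summed yields over the common years. An average of the yearly percentages is another possible choice and will generally give a slightly different figure.

```python
def percent_of_check(variety_yields, check_yields):
    """Yield of a variety as percent of the check, using only years both were grown.

    Both arguments map year -> yield. Returns (percent, list of common years).
    """
    common_years = sorted(set(variety_yields) & set(check_yields))
    if not common_years:
        raise ValueError("variety and check share no years under test")
    variety_total = sum(variety_yields[y] for y in common_years)
    check_total = sum(check_yields[y] for y in common_years)
    return 100.0 * variety_total / check_total, common_years


if __name__ == "__main__":
    # Hypothetical yields in bushels per acre.
    check = {1931: 20.0, 1932: 30.0, 1933: 25.0, 1934: 10.0}
    new_variety = {1932: 33.0, 1933: 27.5}   # tested in two seasons only
    percent, years = percent_of_check(new_variety, check)
    print(years, percent)
```

Had the two-year mean of the new variety been set against the four-year mean of the check, the poor season of 1934 would have counted against the check alone and overstated the advantage of the new variety.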

(c) Errors Due to Soil Variation

It  is  impossible  to  secure  a  perfectly  uniform  soil  for  field  experiments. 
Differences  in  productive  capacity  commonly  occur  in  different  portions  of  the  same 
field. In fact, soils vary in composition and productivity from foot to foot with
the result that it is impossible to say that any soil is uniform, even on small
areas. However, the investigator should secure as uniform a piece of ground as
possible. Sedentary soils are usually more uniform than drift soils, and level land is
more likely to be uniformly productive than hilly land. Other factors that may
introduce variation are: topography, under-drainage, sub-soil, and previous soil
management practices.

V. Errors in Laboratory and Greenhouse Tests

There  are  many  possibilities  for  error  in  tests  of  this  kind.  Probably  the  most 
serious one is to draw conclusions from laboratory tests for field conditions
without a field test. Laboratory tests should supplement, rather than replace, the field
experiment.

(a) Errors in Greenhouse Tests

Some of the possibilities for error may be listed as follows: (1) The number
of plants is small. Plant individuality assumes major importance, particularly when
the investigator works with large plants. (2) There may be unequal distribution of
water. It is difficult to get a uniform distribution of water through a heavy soil.
(3) It is often important that the exact amount of water in a soil be known. This is
particularly true in pots for freezing tests. (4) There may be a lack of uniformity
in the soil itself. This may be alleviated by thoroughly mixing the soil into a
homogeneous mass. The mixed soil should be packed uniformly in all pots. (5) Some
insects may be restricted to greenhouse conditions. As a result, the behavior
in the field may be entirely different so far as insects are concerned. (6) There
may be a temperature or light differential under controlled conditions. A lack or
over-balance of either or both may introduce a systematic error in the experiment.
Le Clerg (1935), in a uniformity trial with 400 small pots in a greenhouse
experiment, found the per cent of damping-off in sugar beets to be less in the border-row
pots on a raised concrete bench than in those farther removed from the heat pipes.
The effect was almost absent in a bench provided with wall boards to deflect direct
heat. The unequal exposure to light or heat may be corrected in some instances by
rotation of the pot table periodically.

(b)  Comparison  of  Potometer  and  Field  Trials 

Data  from  pot  experiments  and  field  trials  were  found  by  Coffey  and  Tuttle 
(1915) to agree closely in fertilizer experiments. However, many fertilizer
analogies from pot tests have led to errors in interpretation. Kezer and Robertson (1927)
found no agreement between potometers and field plots in irrigation studies with
wheat. Potometers with late irrigation treatments became so dry that the soil pulled
away from the edge of the can. When water was added, most of it ran down the cracks
and out of reach of the root systems of the stunted plants.

VI.  Statistical  Methods  in  Relation  to  Variation 

The statistical method is the mathematical means to measure and describe variation
and to allocate its component parts to certain recognized sources. Variation can be
measured quantitatively through the medium of an experimental design that takes into
account the recognizable sources of variation. The measurement of total variation
makes it possible to obtain a measure of that due to all uncontrolled sources. The
statistical method concludes its role when it gives the experimenter a means to
compare the obtained quantitative measures of variation due to the recognized possible
causal factors with the variation classified as error and also with each other.
Thus, conclusions can be drawn in regard to the relative importance of the sources of
variation.
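
The allocation of variation to recognized sources can be sketched with a small worked example. The Python code below assumes a layout in which every treatment occurs once in every block (the text above does not name a particular design), and the function name and plot yields are hypothetical. The total sum of squares is divided into parts for treatments and blocks, and the remainder is classified as error. The treatment variation is then compared with the error variation as a ratio of mean squares.

```python
def partition_variation(table):
    """Partition the total sum of squares of a treatments-by-blocks table.

    table[i][j] is the yield of treatment i in block j (one plot per cell).
    """
    t = len(table)        # number of treatments
    b = len(table[0])     # number of blocks
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / (t * b)

    total_ss = sum(x * x for row in table for x in row) - correction
    treatment_ss = sum(sum(row) ** 2 for row in table) / b - correction
    block_totals = [sum(table[i][j] for i in range(t)) for j in range(b)]
    block_ss = sum(bt ** 2 for bt in block_totals) / t - correction
    error_ss = total_ss - treatment_ss - block_ss   # variation from unrecognized causes

    treatment_ms = treatment_ss / (t - 1)
    error_ms = error_ss / ((t - 1) * (b - 1))
    return {
        "total_ss": total_ss,
        "treatment_ss": treatment_ss,
        "block_ss": block_ss,
        "error_ss": error_ss,
        "variance_ratio": treatment_ms / error_ms,
    }


if __name__ == "__main__":
    # Hypothetical yields: 3 treatments (rows) in 3 blocks (columns).
    yields = [
        [10.0, 12.0, 15.0],
        [13.0, 15.0, 17.0],
        [16.0, 19.0, 20.0],
    ]
    for source, value in partition_variation(yields).items():
        print(f"{source}: {value:.4f}")
```

The block sum of squares is the portion of soil variation that the design recognizes and removes. Whatever remains after treatments and blocks are accounted for is assigned to experimental error.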

VII. Classical Fallacies in Agronomy

A  number  of  fallacies  in  agronomy  have  been  listed  by  Salmon  (1929).  Many  of  these 
ideas  were  accepted  as  facts  until  rather  recently.  An  analysis  of  these  fallacies 
shows how each came to be accepted by agriculturists.

(a)  Conservation  of  Moisture  by  the  Dust  Mulch 

The effectiveness of the soil mulch in the conservation of soil moisture has
been under discussion for many years. The early work, on which the dust mulch theory
was based, was performed in the laboratory. Between 1885 and 1900, King (1907)
showed that the dust mulch was quite effective in the reduction of water evaporated
from the soil surface. In fact, the water loss was about one-half that from a bare
soil. However, King worked in the laboratory with soil in tubes, the water table
being only 22 inches from the soil surface. On the basis of this and similar
experiments has rested the conviction that the soil mulch would reduce evaporation losses
and materially aid in the conservation of moisture. This theory was believed and
practiced until tests by the Office of Dry Land Agriculture (USDA) proved that it was
without foundation. Call and Sewell (1917) showed that the soil mulch failed to
increase the moisture in the soil. In fact, the mulched plots actually lost more water
than bare undisturbed soil. The limit of capillary rise from a free water surface is
only about 10 feet, according to the work of Shaw and Smith (1927). However, they
found moisture losses to be quite rapid from unmulched soil where the water table was
4 to 6 feet from the surface. Other experiments in Illinois, Missouri, and Nebraska
have shown that corn yielded almost as much where the weeds were scraped with a hoe
as where the plots were cultivated (mulched). Shaw (1929) reworked King's experiment
using soil tubes 4 feet high, and maintaining a constant water table at the base of
each. The loss in the mulched tube was 38 per cent less than that from the tube in
which the soil was left bare. This test merely confirmed the fact that the results
from these soil tubes could not be applied to field conditions where the free water
surface is usually more than 10 feet from the soil surface. Under dryland conditions
where moisture conservation is extremely important, the water table is very often
200 to 400 feet from the surface.

(b)  Deep  Plowing  for  Moisture  Conservation 

The theory that very deep plowing will save moisture by an increase in the
storage volume of the soil is an old one that dates back to about 1880. It was
sometimes advocated that the soil be stirred from 14 to 18 inches deep. Deep tillage
was widely advocated on the Great Plains along about 1910 by Hardy W. Campbell. Most
of the implements used were soon allowed to rust out in fence corners.
Experimentation very quickly showed that deep tillage (14 to 18 inches deep) was impractical or
actually depressed the yields under dryland conditions. Brandon (1925) found that
winter wheat grown on plots subsoiled every two years actually yielded 1.3 bushels
per acre less as a 15-year average than wheat on land plowed at ordinary depths.
Similar results were obtained in Wyoming by Nelson (1929).

(c) Continuous Selection of Small Grains

It was believed at one time that continuous selection was a means to
invariably improve small grains. After 50 years of continuous selection, Vilmorin concluded
that no improvement had resulted in wheat, a self-fertilized crop. The pure line
theory worked out by Nilsson-Ehle and by Johannsen showed that selection was
effective only in heterozygous material. This old idea on the value of selection was
probably due to a disregard of the difference between self-fertilized and cross-fertilized
plants.

(d) Selection of Seed Corn by Score Card Standards

Arbitrary  score  card  standards  were  improvised  in  the  early  days  as  ideals 
for  seed  selection  in  corn.  These  standards  laid  stress  on  such  points  as  shape  of 
kernel, length of kernel, ears with well-filled butts and tips, percentage of grain
on the cob, weight of ear, etc. Uniformity of ears was particularly stressed. The
height of the belief in the "pretty ear" was reached about 1910 when the most
"perfect" ear at the National Corn Show sold for several hundred dollars. When planted
in the field in comparison with ordinary ears, it failed to surpass them either in
yield or quality. This started a great amount of research on the relation of score
card points to yield. It was generally proved that such arbitrary standards are of
little value. In fact, close selection for type was generally shown to result in an
approach to homozygosity with a reduction in yield and vigor as a consequence. Some
of the investigators who aided in the upset of this theory were: Cunningham (1916);
Love and Wentz (1917); Olson, Bull, and Hayes (1918); Kiesselbach (1922); and Richey
(1925).

(e) Calcium-Magnesium Ratio in Soils

A physiological balance seems to be necessary in nutrient solutions for
normal plant growth. In 1892, Loew proposed the calcium-magnesium ratio hypothesis.
He worked out the optimum ratio for a number of different plants in water cultures.
He concluded that either calcium or magnesium used alone was toxic, but that the
toxicity disappeared when the ratio of these elements fell within certain limits. The ratios
which Loew used varied from 1 CaO : 1 MgO to 7 CaO : 1 MgO. A large amount of
investigation has been conducted on this ratio in which it has been shown that a rather
definite ratio of CaO to MgO is required in nutrient solutions for optimum plant
growth. The same applies to other nutrient elements as well. However, there appears
to be little evidence to support the necessity for a definite ratio of CaO to MgO in
soils. Recently, Moser (1933) reported that the ratio itself showed no relation to
crop yields. The beneficial effect of lime added to the soil was attributed to the
increase in replaceable calcium rather than to an alteration of the calcium-magnesium
ratio. It is sufficient to state that Loew conducted his experiments with water
cultures which probably react differently from soils.

(f) Addition of Burnt Limestone to the Soil

It is still believed by some farmers that the addition of burnt limestone to
the soil results in a destruction of organic matter and an increase in the soil
acidity. That burnt limestone increased the acidity was reported by the Pennsylvania
Experiment Station. The theory, as taught, was based on small analytical differences
in soil analyses.

(g)  Acid  Phosphate  and  Soil  Acidity 

The use of green manure and acid phosphate was at one time said to increase
soil acidity. Grass and green material were known to decay and give an acid under
laboratory conditions. Careful work under field conditions has shown that bacteria
use up the organic acid formed. Acid phosphate was thought to increase soil acidity
because of the name. The name has been changed to superphosphate recently for
psychological reasons.


References 

1. Brandon, J. F. Crop Rotation and Cultural Methods at the Akron Field Station.
Dept. Bul. 1304, USDA. 1925.

2. Call, L. E., and Sewell, M. C. The Soil Mulch. Jour. Amer. Soc. Agron., 9:49-61.
1917.

3. Carleton, M. A. Limitations in Field Experiments. Proc. Soc. for Agri. Sci.,
pp. 55-61. 1909.

4. Coffey, G. N., and Tuttle, H. F. Pot Tests with Fertilizers Compared with Field
Trials. Jour. Amer. Soc. Agron., 7:128-135. 1915.

5. Cunningham, C. C. The Relation of Ear Characters of Corn to Yield. Jour. Amer.
Soc. Agron., 8:188-196. 1916.

6. Farrell, F. D. Interpreting the Variation of Plot Yields. Cir. 109, BPI, USDA,
pp. 27-32. 1913.

7.  Fisher,  R.  A.  Design  of  Experiments,  pp.  1-12.  1937. 

8. Kezer, A., and Robertson, D. W. The Critical Period of Applying Irrigation Water
to Wheat. Jour. Amer. Soc. Agron., Vol. 19, No. 2. 1927.

9. Kiesselbach, T. A. Corn Investigations. Nebraska Agr. Exp. Sta. Res. Bul. 20.
1922.

10. King, F. H. Physics of Agriculture. 1907.

11. Le Clerg, E. L. Factors Affecting Experimental Error in Greenhouse Pot Tests
with Sugar Beets. Phytopath., 11:1019-1025. 1935.

12. Lipman, Chas. B. A Critique of the Hypothesis of the Lime-Magnesia Ratio.
Plant World, 19:83-105, and 119-135. 1916.

13. Love, H. H., and Wentz, J. B. Correlations Between Ear Characters and Yield in
Corn. Jour. Amer. Soc. Agron., 9:315-322. 1917.

14. Moser, F. The Calcium-Magnesium Ratio in Soils and Its Relation to Crop Growth.
Jour. Amer. Soc. Agron., 25:365-377. 1933.

15. Nelson, A. L. Methods of Winter Wheat Tillage. Wyo. Agr. Exp. Sta. Bul. 161.
1929.

16. Noll, C. F. The Type of Problem Adapted to Field Experimentation. Jour. Amer. Soc.
Agron., 20:421-425. 1928.

17. Olmstead, L. B. Some Applications of the Method of Least Squares to Agricultural
Experiments. Jour. Amer. Soc. Agron., 6:190-204. 1914.


3h 

18.  Olson,  P.  J.,  Bull,  C.  P.,  and  Hayes,  E.  K.  Ear  Type  Selection  and  Yield  in 

Corn.  Minn.  Agr.  Exp.  St  a.  Bui.  l*jk.      1918. 

19.  Pichey,  F.  P.  Corn  Judging  and  the  Productiveness  of  Corn.  Jour.  Amor.  Soc . 

Agron.,  Vol.  17,  No.  6,  1925 . 

20.  Salmon,  S.  C.  Principles  of  Agronomic  Experimentation  (Unpublished  lectures) 

Kansas  State  College.  1929. 
21. Some  Limitations  in  the  Application  of  the  Method  of  Least  Squares 

to  Field  Experiments.  Jour.  Amor.  Soc.  Agron.  15:225-239.   1923* 
22.  Shaw,  C.  F.  When  the  Soil  Mulch  Conserves  Moisture.   Jour.  Amor.  Soc.  Agron., 

21:1165-1171.   I929. 
23, ,  and  Smith,  A.  Maximum  Height  of  Capillary  Pvise  Starting  with  Soil  at 

Capillary  Saturation.  Hilgar&ia,  2:599-409.   1927- 
2k.   Si  overs,  F.  J.  Outstanding  Weaknesses  in  Investigational  Work  in  Agronomy. 

•  Jour.  Am.  Soc.  Agron.,  17:88-69.   1925 . 
25,  Smith,  L.  H.  Plot  Arrangement  for  variety  Experiments  with  Corn.  Proc.  Am.  Soc. 

Agron.,  1:84-39.  1909. 


Questions for Discussion

1. Distinguish between chance and systematic errors.

2. What errors in field experiments can be controlled?

3. What kinds of errors in field experiments are not controlled? How are they minimized?

4. What errors can be made in the interpretation of experimental results?

5. How may the personal factor influence experimental results?

6. What are the general sources of variation encountered in field experiments?

7. What factors cause plot yields to differ from season to season?

8. What errors may occur in greenhouse tests?

9. How did the soil mulch theory originate and, in the light of present knowledge,
how might the error have been prevented?

10. Is there any experimental or scientific basis for the belief that very deep plowing
(10 inches or more) is profitable? Explain how this idea originated.

11. How did the belief that good seed corn is characterized by deep, rough kernels,
and cylindrical ears originate?

12. What was the basis for the belief that a certain calcium-magnesium ratio was
necessary for plant growth?

13. Explain the origin of the idea that burned lime decreases organic matter in the
soil.

14. Is there any reason to believe that acid phosphate or green manure increases
soil acidity? Why was it thought they did?

15. Make a general statement which will explain the sources of error that have
occurred in agronomic science.
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Statistical  Analysis  of  Data 


CHAPTER V

FREQUENCY DISTRIBUTIONS AND THEIR APPLICATION

I. Measurements and Collection of Data

Quantitative data, collected as a result of measurements, are widely used in research
work. To measure a quantity is to determine by any means, direct or indirect, its
ratio to the unit employed in expressing the value of that quantity (Weld, 1916).
Every measure has some sort of linear scale, either straight or curved, on which the
magnitudes are read. This is because the human eye can measure length far more
accurately than it can most other magnitudes. However, the investigator should realize
that there is no such thing as an exact measurement. Seldom will a re-weighing or
re-measurement give exactly the same quantity because of inaccuracies that arise from
imperfect apparatus and judgment in estimation. An observer may tend to over-estimate,
or his measurements may be prejudiced, or his judgment may fluctuate. Because
it is next to impossible to arrive at a true value, measurements should be made as
carefully as possible in order to obtain the closest approximation. The units of
measure will depend upon the degree of precision required in the work. One should
distinguish between errors of this kind and inaccuracies due to carelessness, which are
more properly called mistakes. Mistakes consist of blunders like reading the wrong
number on the scale, recording a figure wrongly in a notebook, forgetting to deduct
tare, etc. It is much easier to check the accuracy of weights when they are made more
than once. Eternal vigilance and care are necessary to reduce mistakes to the minimum.
The investigator should realize that it is impossible to evolve sound results from
unsound or carelessly collected data merely through the application of a formula.

II.  Statistics  in  Experimental  Work 

After data are collected, it becomes desirable to describe them, interpret them, and
draw inferences from them. This is the realm of statistics.

(a)  Statistics  Defined 

Statistics may be regarded as the mathematical analysis, based on the theory
of probability, applied to observational data in an attempt to summarize and describe
them so that conclusions can be drawn concerning the phenomena that supply the data.
Fisher (1934) states that the original meaning of statistics suggests it was a study
of populations of human beings living in political union. The methods developed,
however, have little to do with political unity. In fact, they are applied to
populations, animate or inanimate.

(b) Use of Statistics

Statistics are used in astronomy, biology, genetics, education, psychology,
and many other fields. They are particularly applicable to data concerned with life
or the products of life. Probably 75 to 80 per cent of the agronomic workers in
agricultural experiment stations use statistical methods, although only about
one-half of these apply statistics to other than yield data. However, statistical
methods are being used more extensively as time goes on.

III.  Some  Typical  Statistical  Terms 

The effort to characterize and describe the data mathematically leads to the
calculation of various statistics. The simplest of these is the average or mean. It is
natural for the first step to be an attempt to find a single measure which will best
describe the sum total of the information expressed in a mass of data. The best
single measure is the mean. However, it fails to tell the entire story. Among the
other statistics are the median, mode, average deviation, standard deviation,
coefficient, and correlation ratio. Among the values derived from these statistics are
their standard errors and probable errors, employed in the important problem of
estimation and prediction.

(1) By a variable is meant any organ or character which is capable of variation
or difference in size or kind. This difference may be measured directly, as in height,
temperature, weight, etc., or expressed only indirectly, as in the case of color,
occupation, etc. Variation may be continuous or discrete*. For example, a temperature
change from 60 to 61 degrees must pass continuously through every intermediate state
between 60 and 61 degrees. On the other hand, variation may take place by integral
steps without intermediate values, as in a population, which can never go up or down
by less than one.

(2) A variate (x) is an individual value of a variable, e.g., 3 feet, 200 grams,
15 pounds, etc.

(3) The frequency (f) is the number of times a particular variate (x) occurs between
two limiting values of a variable, i.e., the number of variates in any one class.

(4) A population is the totality of individuals which are to be studied with regard
to a character and may be finite or infinite.

(5) A sample may be all or a part of a population. A random sample is a sample taken
in such a way that all individuals which make up a population have an equal chance of
being included in the sample.

IV. Rules for Computation

It is desirable to be consistent in the number of decimal places used in computations,
and in the manner of dropping decimals. Suppose it is desired to retain two decimal
places. When the digit in the third decimal place is exactly 5, as in 82.575, the
preceding digit is raised if it is odd so that the retained figure ends in an even
number; thus 82.575 becomes 82.58. When the digit in the third decimal place is
greater than 5, one is added to the second decimal place; when it is less than 5, it
is simply dropped. For the square root of a quotient to be accurate to two decimal
places, it is recommended that the quotient be carried to four decimal places. This is
especially important where the square root is to be used in multiplications for other
computations.
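The rounding rule above can be tried out with a short sketch. It assumes that the rule intended is the usual "round half to even" convention, which Python's standard `decimal` module provides; the helper name `round_even` is hypothetical and is introduced only for illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_even(value: str, places: int = 2) -> Decimal:
    """Round a decimal string to `places`, sending an exact trailing 5 to the even digit."""
    quantum = Decimal(10) ** -places          # Decimal("0.01") for two places
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)

print(round_even("82.575"))   # the 7 is odd, so it is raised to 8
print(round_even("82.585"))   # the 8 is already even, so the 5 is dropped
print(round_even("82.576"))   # third decimal greater than 5: one is added
print(round_even("82.574"))   # third decimal less than 5: it is dropped
```

The values are passed as strings so that the figure 82.575 is held exactly; a binary floating-point number cannot store it exactly, and the rule for an exact 5 would then not be exercised.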

V.  Arithmetic  Average  or  Mean 

Masses  of  unorganized  data  explain  little  or  nothing.   Individual  measures  are  less 
significant  than  a  typical  value  which  stands  for  a  number  of  measurements.   An 
average  or  mean  is  such  a  value.   It  is  the  single  constant  most  commonly  employed  to 
describe  the  sample. 

(a)  Simple  Arithmetic  Mean 

The  mean  may  be  considered  the  center  of  gravity  of  a  sample.   It  is  equal  to 
the  sum  of  the  individual  measurements  divided  by  their  number. 

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{N} \quad \text{or} \quad \bar{x} = \frac{Sx}{N} \tag{1}$$

where $Sx$ = the sum of all the variates, $N$ = the total number of variates, $\bar{x}$ = the
arithmetic mean, and $x_1, x_2, \ldots, x_n$ the individual variates.

For example, the yields of Golden Glow corn on 3 plots were 84.8, 86.9, and 89.9
bushels per acre. The arithmetic mean would be:

$$\bar{x} = \frac{84.8 + 86.9 + 89.9}{3} = 87.2$$


*Note: This usage is somewhat different from that in Genetics, where a discontinuous
variation refers to a germinal change that breeds true, while a continuous variation
applies to variations due to environment and non-heritable.

(b) Mean of Replicated Variates¹

It must be remembered that the weight of each variate must be equal in the
sample. When certain variates are repeated, the computation may be shortened by
merely considering each distinct variate multiplied by the number of times it appears.
Suppose 7 corn plants of variety "A" were measured for height in the first replication
and were found to average 59 inches. In the second replication, 3 plants were
measured, and averaged 67 inches. A total of 20 plants were measured for height in
a third replication and found to average 54 inches. Suppose one desired to know the
average height for the variety. A simple arithmetic mean of 59, 67, and 54 (i.e.,
60 inches) would be incorrect because a different number of plants made up the
original means in the different replications. The mean must be calculated so as to give
due weight to each variate for the number of times that it occurs. For instance, the
mean may be calculated as follows:

$$\bar{x} = \frac{(59 \times 7) + (54 \times 20) + (67 \times 3)}{30} = \frac{1694}{30} = 56.47 \text{ in.}$$

The same result may be obtained by adding the original 30 variates and
dividing by 30.
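The corn-height example can be checked with a few lines of Python. This is only a sketch of the arithmetic in the text; the variable names are illustrative.

```python
# (mean height in inches, number of plants) for each replication, from the text
replications = [(59, 7), (67, 3), (54, 20)]

total_plants = sum(n for _, n in replications)

# Each replication mean is weighted by the number of plants behind it.
weighted_mean = sum(mean * n for mean, n in replications) / total_plants

# The simple mean of the three replication means ignores those numbers.
unweighted_mean = sum(mean for mean, _ in replications) / len(replications)

print(f"weighted mean   = {weighted_mean:.2f} in.")
print(f"unweighted mean = {unweighted_mean:.2f} in.")
```

The two figures differ because the third replication, with 20 of the 30 plants, pulls the weighted mean toward its lower average.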

VI. The Frequency Distribution

The mean for replicated variates may be calculated from a frequency table, which is a
simple device by which a considerable quantity of data may be organized in condensed
and classified form. Some data presented by Goulden (1937) on the yields in grams of
400 barley plots will be used to illustrate the frequency table. The yields which
follow represent an aggregate of data in which there are 400 variates. Each
measurement is a variate, i.e., a particular measured value of the variable (x), yield.

Yields in Grams of 400 Square-Yard Plots of Barley²

135 162 136 157 141 130 129 176 171 190 157 147 176 126 175 134 169 189 180 128
169 205 129 117 144 125 165 170 153 186 164 123 165 203 156 182 164 176 176 150
216 154 184 203 166 155 215 190 164 204 194 148 162 146 174 185 171 181 158 147
165 157 180 165 127 186 133 170 134 177 109 169 128 152 165 139 146 144 178 188
133 128 161 160 167 156 125 162 128 103 116 87 123 143 130 119 141 174 157 168
195 180 158 139 139 168 145 166 118 171 143 132 126 171 176 115 165 147 186 157
187 174 172 191 155 169 139 144 130 146 159 164 160 122 175 156 119 135 116 134
157 182 209 136 153 160 142 179 125 149 171 186 196 175 189 214 169 166 164 195
189 108 118 149 178 171 151 192 127 143 158 174 191 134 188 248 164 206 135 192
147 178 189 141 173 187 167 128 139 152 167 131 203 231 214 177 161 194 141 161
124 130 112 122 192 155 196 179 166 156 13'L 179 201 122 207 189 164 131 211 172
170 140 156 199 181 181 150 184 154 200 187 169 155 107 143 145 190 176 162 123
189 194 146 2£ 160 107 70 "34 112 162 124 136 138 101 138 141 143 135 163 1S3
99 118 150 151 33 136 171 191 155 164 98 136 115 168 130 111 136 129 122 120
179 172 192 171 151 142 193 174 146 180 140 137 138 194 109 120 124 126 126 147
115 148 195 154 149 139 163 118 126 127 139 174 167 175 179 172 174 167 142 169
122 163 144 147 123 160 137 161 122 101 158 103 119 164 112 57 (§3) 106 132 122
164 142 155 147 115 143 68 184 183 167 160 138 191 153 loO 156 122 111 153 143
103 131 180 142 191 175 146 101 111 110 154 176 168 175 175 146 148 167 106 123
121 154 148 91 93 74 113 79 131U9 96 86 97 98 106 107 69 86 94 129



¹This has sometimes been called a "weighted" mean.

²Data from Methods of Statistical Analysis by C. H. Goulden, p. 7, 1937.



(a) Grouping of Data into Classes

The above data are unwieldy in their present form, even though quite simple
in nature. They may be condensed by grouping. First, find the highest and lowest
values of the variates (barley yields in grams). The interval thus defined by these
extreme values is known as the range. In this case it is 22 to 248. The next step
in the formulation of a frequency table or distribution is to separate the range into
classes. Although unnecessary, it is usually convenient for the classes to have
equal range (interval) within themselves. The number of classes to be formed is the
next question. Experience has shown that somewhere between 7 and 20 classes is a
desirable number with which to work. The smaller the number of classes the greater
is the error due to grouping. The approximate number of classes can be determined
from a formula given by Yule (1929):

$$\text{Number of Classes} = 2.5 \sqrt[4]{\text{Number in Sample}} = 2.5 \sqrt[4]{400} = 11.18$$
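The rule of thumb quoted above is easily evaluated for any sample size. The sketch below assumes the formula is 2.5 times the fourth root of N, which reproduces the figure of 11.18 for the 400 barley plots; the function name is hypothetical.

```python
def approximate_class_count(sample_size: int) -> float:
    """Rule of thumb quoted in the text: 2.5 times the fourth root of N."""
    return 2.5 * sample_size ** 0.25

n_plots = 400
low, high = 22, 248              # extreme barley yields given in the text

k = approximate_class_count(n_plots)
print(f"suggested number of classes: {k:.2f}")

# Width each class would need if the range were split into 11 or 12 classes.
for classes in (11, 12):
    print(classes, "classes -> interval of about", round((high - low) / classes, 1))
```

The result is a guide only; the final choice of class number and interval remains a matter of convenience.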

Suppose 12 classes are decided upon. The quotient of the range divided by the number
of classes is the approximate class interval, viz., 226/12 = 18.8. However, a class
interval with an odd number is more convenient because the midpoint of each class then
does not require an additional decimal. Suppose 19 is selected as the class interval.
The value of a class is taken at its mid-value. The barley data may be tabulated for
a class interval of 19 as follows:

Class Range    Class Value    Tabulation
22-40          31             1
41-59          50             1
60-78          69
79-97          88
98-116         107
117-135        126
136-154        145
155-173        164
174-192        183
193-211        202
212-230        221
231-249        240

(This tabulation can be continued in like manner for the other variates.)

N = 400
(b)  Frequency  Table 

After the data are tabulated they are next arranged in a frequency table,
i.e., the frequencies are entered to correspond to their class values.

The mean (x̄) for this sample can be conveniently calculated from the frequency
table. Each class value is multiplied by its frequency (f) to give fx. These values
are summed to give S(fx) and divided by the total number in the sample. For the
barley yields:

$$\bar{x} = \frac{S(fx)}{N} = \frac{60{,}812}{400} = 152.03$$

It should be evident that the classification of the data into a frequency
distribution has distorted them from their original form.
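The grouping procedure and the mean from a frequency table can be sketched in Python. The example below uses only the first 20 of the 400 barley yields, as read from the first row of the table, so its results will not match the full-sample figure of 152.03. The lower limit of 22 and the class interval of 19 follow the text; the helper `class_value` is a hypothetical name.

```python
from collections import Counter

# First 20 of the 400 barley yields (grams), as read from the first row of the table.
yields = [135, 162, 136, 157, 141, 130, 129, 176, 171, 190,
          157, 147, 176, 126, 175, 134, 169, 189, 180, 128]

LOWEST = 22      # lower limit of the first class
INTERVAL = 19    # class interval chosen in the text

def class_value(x: int) -> int:
    """Return the mid-value of the class into which x falls."""
    index = (x - LOWEST) // INTERVAL
    return LOWEST + index * INTERVAL + INTERVAL // 2

# Frequency table: class value -> number of variates in that class.
frequency = Counter(class_value(x) for x in yields)

n = sum(frequency.values())
grouped_mean = sum(f * x for x, f in frequency.items()) / n   # S(fx) / N
raw_mean = sum(yields) / len(yields)

for x in sorted(frequency):
    print(f"class value {x:3d}   f = {frequency[x]:2d}   fx = {x * frequency[x]}")
print(f"S(fx)/N = {grouped_mean:.2f}   ungrouped mean = {raw_mean:.2f}")
```

The small difference between the grouped and ungrouped means shows the distortion that grouping introduces, since every variate in a class is treated as if it lay at the class mid-value.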




(c) Graphical Representation of Frequency Table

A visible representation of a large number of measurements is afforded by
either a histogram or a frequency polygon.

The histogram is most commonly used. The character to be measured is represented
along the horizontal axis (abscissa), while the frequencies are represented
vertically (ordinate) to correspond to each class. For example, the barley yield
data may be plotted as follows:
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The  frequency  polygon  is  constructed  by  joining  in  sequence  the  midpoints  of  the 
tops  of  the  bars  of  the  histogram.  Its  shape  tends  towards  the  smooth  curve  of  the 
population  from  which  the  sample  was  drawn.  The  frequency  polygon  for  the  barley 
yield  data  is  as  follows: 
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VII.  Measures  of  Central  Tendency 

There are three measures of central tendency that must be defined at this point.
(1) The arithmetic mean, already discussed, is the center of gravity of the
population. (2) The median is the measure of the middle variate in an ordered arrangement
of the variates according to magnitude. (3) The mode is the measure of the class of
greatest frequency, or the point at which the most variates occur. In other words,
it is the x-value at which the frequency polygon has the highest ordinate.
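The three measures may be compared on a small set of ungrouped variates. The sketch below uses Python's standard `statistics` module and, as an assumption for illustration, the first 20 barley yields as read from the table; with so few variates the mode is not unique, so `multimode` is used to list every value that ties for the greatest frequency.

```python
import statistics

# First 20 barley yields (grams), as read from the first row of the table.
yields = [135, 162, 136, 157, 141, 130, 129, 176, 171, 190,
          157, 147, 176, 126, 175, 134, 169, 189, 180, 128]

mean = statistics.mean(yields)         # sum of the variates divided by their number
median = statistics.median(yields)     # middle of the ordered variates
modes = statistics.multimode(yields)   # value or values occurring most often

print("mean   =", mean)
print("median =", median)
print("modes  =", modes)
```

In a grouped distribution the mode would instead be read as the class value with the greatest frequency.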

VIII. Types of Frequency Distributions

Before one goes further with the analysis to describe the nature of the aggregate of
the data, it is necessary to roughly determine the type of frequency distribution.
Some mathematical expression is essential corresponding to those types most often
encountered in actual practice. (1) A great many frequency distributions found in
practice are unimodal, i.e., have one peak. (2) There is a general tendency for them
to be bell-shaped when the frequency polygon or diagram is smoothed. It was early
noticed that the curve derived from the theoretical distribution of the expansion of
a binomial, $(a + b)^n$, possessed many of the same characteristics as frequency
distributions met with in actual practice. However, the binomial distribution fails to
represent continuous variation. An effort to find a mathematical equation for a
curve which would well fit the points of a binomial distribution led to the discovery
of what is known as the normal probability curve and its equation. Types of
distributions most commonly approached in the graphical representation of data are the
normal, binomial, and the Poisson distributions.


(a) Normal Distribution

The normal curve is a bell-shaped, symmetrical curve. It is characterized
by the symmetrical arrangement of the items around the central value. The arithmetic
mean, median, and mode coincide in the normal curve. As in the case of many frequency
distributions, the small deviations from the central value (mean) occur more
frequently and the larger deviations less frequently. Fisher (1934) gives the statures of
1375 women in a curve that closely approaches a normal curve.

(b) Binomial Distribution

The binomial distribution is represented by the expansion of the binomial
$(p + q)^n$.* To understand the application of the binomial distribution to data, it is
first necessary to make some study of probability. This subject will be treated
later.

*Note:

$$(p + q)^n = p^n + n\,p^{n-1} q + \frac{n(n-1)}{1 \cdot 2}\, p^{n-2} q^2 + \frac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3}\, p^{n-3} q^3 + \cdots$$
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(c)  Poisson  Distribution 

The  Poisson  distribution  is  biometrically  unsymmetrical,  i.e.,  it  is  extreme- 
ly skew.  This  type  of  distribution  results  from  an  attempt  to  represent  the  expan- 
sion of  (p  +  q)n  when  "p"  is  extremely  small.  This  type  seems  particularly  applic- 
able to  purity  and  germination  counts  in  seed  testing,  as  well  as  many  other  appli- 
cations. 
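The relation between the two distributions can be seen numerically. The sketch below compares the terms of the binomial expansion with the corresponding Poisson terms when "p" is very small; the figures of 1000 seeds and an impurity rate of 0.2 per cent are hypothetical, chosen only to resemble a purity count.

```python
import math

def binomial_term(n: int, k: int, p: float) -> float:
    """Probability of k occurrences in n trials: one term of the expansion of (p + q)^n."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_term(mean: float, k: int) -> float:
    """Poisson probability of k occurrences when the expected number is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)

n, p = 1000, 0.002   # hypothetical: 1000 seeds examined, 0.2 per cent impurities
for k in range(6):
    print(k, round(binomial_term(n, k, p), 4), round(poisson_term(n * p, k), 4))
```

The two columns agree closely, and both are markedly skew: the probabilities fall away much more slowly above the expected number than below it, since the count cannot be less than zero.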

(d) Other Types of Distributions

Sometimes two or more factors influence the shape of a frequency distribution
so that it has two peaks. This would be a bimodal curve. When the data which
provide two unimodal frequency distributions with two substantially different means are
combined into one frequency distribution, the distribution that results may be
bimodal due to the fact that non-homogeneous data overlap.* This happens occasionally in
genetic data.

IX.  Some  Constants  used  to  Describe  Distributions 

There  are  several  constants  or  statistics  used  to  describe  distributions.  Those  of 
position  or  central  tendency  (mean,  mode,  median)  have  been  discussed  already.  The 
constants  commonly  used  to  measure  dispersion  of  the  variates  are  the  standard  devia- 
tion, quartile  deviation,  and  the  average  deviation. 

(a)  Standard  Deviation 

The  standard  deviation  of  the  sample  (s')  is  most  frequently  used  in  statis- 
tical work  to  measure  dispersion.  It  is  sometimes  called  the  standard  error  of  a 
single  observation.  The  squared  standard  deviation  (s')2  is  the  sum  of  the  squares 
of  the  deviations  from  the  mean  divided  by  the  number.  This  is  sometimes  called 
variance,  or  the  second  moment  about  the  mean. 

$$(s')^2 \text{ (variance)} = \mu_2 = \frac{S(x - \bar{x})^2}{N} \tag{2}$$

where $\mu_2$ is the second moment.

The standard deviation (s') is the square root of the variance. The formula for the
standard deviation may be expressed as follows:

$$s' = \sqrt{\frac{d_1^2 + d_2^2 + d_3^2 + \cdots + d_n^2}{N}} = \sqrt{\frac{Sd^2}{N}} \quad \text{or} \quad s' = \sqrt{\frac{S(x - \bar{x})^2}{N}} \tag{3}$$

where d is the deviation from the mean, e.g., $d_1 = x_1 - \bar{x}$.


The above formula gives the standard deviation of the sample about its mean. When it
is desired to use this result as an estimate of the standard deviation of the
population (s) about its mean (m), N-1 should be used in the denominator instead of N.
This makes little difference in the result when the sample is large, but N-1 should be
used  when  the  sample  is  small,  ile.,  when  N  is  less  than  ">0  as  an  arbitrary  rule. 

As  an  example,  the  calculation  of  the  standard  deviation  of  a  sample  (s')  can  be 
illustrated  with  the  barley  yields  as  grouped  in  VI  (b)  above.  The  deviations  for 
each  class  are  taken  from  the  actual  means,  i.e.,  152. 


*Note: Pearson's generalized frequency curves or the Gram-Charlier method of
curve-fitting should be used for a finer method of analysis for such distributions.

Thus, 31.27 is the standard deviation (s') of this sample. However, the best
estimate for the standard deviation of the population (σ) from which this sample was
drawn would be:¹

$$s = \sqrt{\frac{S(fd^2)}{N - 1}} = 31.31$$


Another formula for the calculation of the standard deviation of the sample (s') has
been recommended by J. Arthur Harris for machine calculation:

$$s' = \sqrt{\frac{S(x^2)}{N} - \bar{x}^2}$$

This formula is essentially the same as the one given above except that the variates
themselves are used rather than their deviations from the mean.²

The calculation of the standard deviation of the sample (s') by this formula is
illustrated with the barley yield data as follows:

Note:¹ The estimate (s) of the population standard deviation (σ) may be computed from
the sample standard deviation (s'):

$$s = s' \sqrt{\frac{N}{N - 1}}$$
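The agreement between the deviation formula and the machine formula, and the effect of using N-1 in place of N, can be checked with a short sketch. It uses, as an assumption for illustration, the first 20 barley yields as read from the table, and compares the hand formulas with `pstdev` and `stdev` from Python's standard `statistics` module.

```python
import math
import statistics

# First 20 barley yields (grams), as read from the first row of the table.
yields = [135, 162, 136, 157, 141, 130, 129, 176, 171, 190,
          157, 147, 176, 126, 175, 134, 169, 189, 180, 128]

n = len(yields)
mean = sum(yields) / n

# Formula (3): root of the mean squared deviation from the mean.
s_dev = math.sqrt(sum((x - mean) ** 2 for x in yields) / n)

# Machine formula: the variates themselves are squared, not their deviations.
s_machine = math.sqrt(sum(x * x for x in yields) / n - mean**2)

# Estimate of the population standard deviation: N - 1 in the denominator.
s_est = math.sqrt(sum((x - mean) ** 2 for x in yields) / (n - 1))

print(f"s' by deviations      = {s_dev:.4f}")
print(f"s' by machine formula = {s_machine:.4f}")
print(f"s  with N - 1         = {s_est:.4f}")
print(f"s' * sqrt(N/(N-1))    = {s_dev * math.sqrt(n / (n - 1)):.4f}")
```

With only 20 variates the two denominators give noticeably different results; with the full 400 plots the difference is slight, as the text observes.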
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(b)  Coefficient  of  Variability  (C.V.)  is  the  standard  deviation  of  the  sample 
(s') expressed in percentage of the mean. This gives a relative measure of
dispersion so that variation may be compared in features expressed in different units of
measurement. It would often be impossible to compare the variabilities of two
experiments unless they were expressed in a common unit. The formula is as follows:

$$\text{C.V. (Coefficient of Variability)} = \frac{100\, s'}{\bar{x}} \tag{5}$$

For the barley data, it is as follows:

$$\text{C.V.} = \frac{(31.27)(100)}{152.03} = 20.57$$

X. Sheppard's Correction for Grouped Data

An error is introduced by grouping variates into classes due to the fact that the
midpoint of the class is likely to deviate from the mean of the distribution by more
than the mean of the variates grouped in the class in question. This is particularly
true for the extreme classes. The majority of the variates in a class are grouped on
the side nearest the mean of the distribution. This error can be compensated for
mathematically by the use of Sheppard's correction. This correction is equal to 1/12
of the square of the class interval (C), and is subtracted from the value of the
squared standard deviation (s')² as ordinarily obtained, i.e., (s')² - C²/12. However,
Sheppard's Correction is applicable only to large samples where the variables are
continuous.

To calculate the standard deviation without Sheppard's Correction is to assume that
the variates in each class are grouped with the highest frequency at the mean of the
class as shown in the diagram. To do this evidently leads to an error in that s'
will be computed larger than it actually is. Sheppard's Correction compensates for
this type of error which results from grouping data in a frequency distribution.




XI. Short-Cut Methods for Computation of Statistics


So far the statistics for simple frequency distributions have been calculated.
Several short-cut methods are used which greatly reduce the labor of computation. These
methods give the same results. Usually the computations are made from an arbitrary
origin or guess mean (w), with the guess mean corrected to give the true mean (x̄) of
the sample. The guess mean can be taken at any position. Usually it is taken at
the middle of the range or at the lowest class.

The method of computation by use of an arbitrary origin, or guess mean, can be shown
with the barley yield data.
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Note:   It  can  be  readily  proved  that  a  guess  mean  can  be  used  provided  a  correction 
is applied to obtain the true mean. Let $\bar{x}$ = the true mean, w = the guess
mean, C = a constant (class interval), d = the deviation from the guess mean
in units of the class interval, and N = the number.

$$x = Cd + w$$

$$\bar{x} = \frac{S(fx)}{N} = \frac{Sf(Cd + w)}{N} = \frac{C\,S(fd)}{N} + \frac{w\,S(f)}{N} = C\bar{d} + w$$

since S(f) = N.

Symbols: w = guess mean, $\bar{d}$ or S(fd)/N = correction to the guess at the mean, and
C = class interval.

$$\bar{d} = \frac{S(fd)}{N} = \frac{2548}{400} = 6.37$$

$$\bar{x} = w \text{ (guess mean)} + C\bar{d} = 31 + (19)(6.37) = 31 + 121.03 = 152.03$$

$$s' = C\sqrt{\frac{S(fd^2)}{N} - \bar{d}^2} \quad \text{or} \quad C\sqrt{\frac{S(fd^2)}{N} - \left(\frac{S(fd)}{N}\right)^2}$$

$$= 19\sqrt{\frac{17{,}314}{400} - (6.37)^2} = 19\sqrt{43.2850 - 40.5769} = (19)(1.6456) = 31.2664$$

Sheppard's Correction:

$$s' \text{ (corrected)} = \sqrt{(s')^2 - \frac{C^2}{12}} = \sqrt{977.5878 - \frac{361}{12}} = \sqrt{977.5878 - 30.0833} = \sqrt{947.5045} = 30.7816$$

$$\text{C.V.} = \frac{100\, s'}{\bar{x}} = \frac{(30.7816)(100)}{152.03} = \frac{3078.16}{152.03} = 20.2471$$

The arbitrary origin in this case was taken at the first class. The calculation
involves larger numbers than when taken near the center of the range, but all numbers
are positive.
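The whole short-cut computation can be repeated from the class totals alone. The sketch below takes N = 400, C = 19, a guess mean of 31, and S(fd) = 2548 from the worked example; the value S(fd²) = 17,314 is an inference from the quotient 43.2850 given there and should be read as an assumption. Because the square root is carried to full precision here rather than rounded to 1.6456, the later figures differ from those of the text in the third or fourth decimal place.

```python
import math

# Totals from the worked example (deviations d are in class-interval units).
N = 400          # number of plots
C = 19           # class interval
w = 31           # guess mean, taken at the first class value
S_fd = 2548      # S(fd)
S_fd2 = 17314    # S(fd^2), inferred from 43.2850 x 400 (assumption)

d_bar = S_fd / N                                  # correction to the guess mean
mean = w + C * d_bar                              # true mean of the sample
s_prime = C * math.sqrt(S_fd2 / N - d_bar**2)     # standard deviation of the sample
s_corrected = math.sqrt(s_prime**2 - C**2 / 12)   # after Sheppard's correction
cv = 100 * s_corrected / mean                     # coefficient of variability

print(f"mean            = {mean:.2f}")
print(f"s'              = {s_prime:.4f}")
print(f"s' (corrected)  = {s_corrected:.4f}")
print(f"C.V.            = {cv:.4f}")
```

Only two running totals, S(fd) and S(fd²), need to be carried through the table, which is the labor the short-cut method saves.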

XII.  General  Applicability  of  Statistical  Methods 

Knowledge of the frequency distribution leads to an elementary insight into the
statistical process. The methods of statistics must be applied with caution to
experimental data.

(a)  Mathematical  Basis  for  Application 

The methods of statistics comprise the application of the solutions effected
by the calculus of probability to precisely stated mathematical problems in the
attempt to answer questions connected with actual experiments. For the methods of
statistics to validly apply to the practical problems connected with experimental
work it is necessary that a high degree of correspondence exist between the realities
observed in phenomena and the abstract but very definite concepts upon which the
mathematical solution of the problem is based. The one possible way to be certain of
a correspondence is to carry out repeated random experiments. Statistical methods
may be employed to answer questions and test hypotheses that concern phenomena
observed in experimental work when this correspondence is satisfactory. The principal
cause of the misapplication of the statistical method is the fact that it is often
merely assumed, without verification, that the measurements and observations obtained
from experiments correspond to the abstract concepts of the probability theory on
which the statistical method used to interpret the results is based. (See Neyman,
1937.)
(b) Value of the Statistical Method

There are many advantages attributed to the use of the statistical method.

(1) It provides a sound basis for the formulation of experimental designs. Goulden
(1937) makes this statement: "The experiment that has been correctly designed gives
maximum efficiency, an unprejudiced estimate of the errors of the experiment, and
yields results not only on the primary factors with which the experiment is concerned,
but also on the important inter-relations of these factors."

(2) It tends to eliminate the personal equation, i.e., it does away with differences in
personal interpretation.

(3) The statistical method is useful in the reduction and condensation of data. Fisher
(1934) states that no human mind is able to grasp in its entirety the meaning of any
considerable quantity of numerical data. The statistical method allows one to express
relevant information by means of comparatively few numerical values.

(4) It affords a means to measure and evaluate chance errors. This is probably the
outstanding contribution of statistics.

(5) The statistical method affords one of the best measures of concomitant variations,
i.e., correlation.

(6) It gives a quantitative measure of variation, including chance variation. Statistics
are widely used in genetics for this purpose.

(c) Reliability of the Statistical Constant

The reliability that can be placed on statistical constants depends, in many
cases, on the type of data being analyzed. However, several factors contribute to
reliability. (1) Reliability depends on the accuracy of the measurements. (2) Quantitative
data are likely to be more accurately measured than qualitative data.
(3) Samples collected at random are usually more reliable than those selected by
other means, although samples by design in planned arrangements are very good. (4) A
large sample is more likely to be representative than a small one. Arbitrarily,
populations of less than 100 individuals or variates ordinarily are considered small
samples to which special precautions should be applied (Fisher, 1934). Conclusions
drawn from many of the older field experiments are questionable because there were
too many different kinds of treatments and too little replication or repetition.
Statistical methods have done much in recent years to increase the reliability of
field experiments. The difficulty of small samples has been alleviated in many
instances by the calculation of a generalized standard error based on all the plots of
the experiment. Harris (1930) claims that many agronomic experiments can be organized
to "make possible the application of the powerful methods of biometric description
and analysis."

(d) Some Misconceptions of the Statistical Method

There is little question about the value of the statistical method as such,
but much question as to its application. The statistical method cannot correct poor
technic or be applied indiscriminately. The standard error of a statistical constant
fails to measure the accuracy of an experiment unless all errors (personal equation)
have been eliminated except those due to chance. The statistical method may eliminate
some systematic errors, but to no great extent. An effective way to eliminate systematic
errors, or at least to discover them, is to repeat the experiment in a different
manner. Statistics may lend support to a hypothesis but do not necessarily prove
it. Several years ago, arguments on the use of statistical methods in agricultural
research were quite common. The mathematical foundations of the statistical formulae
are now regarded as well established, but argument on the proper application of certain
statistical measures will continue much as it does in experimental technic
generally. Blind application of statistical procedures, as with any other technic,
is harmful. Common sense and good judgment are vital in all phases of experimental
work. Salmon (1929) points out that statistical treatment in itself is seldom
satisfactory because:

(1) The observed result may not be due to the assigned cause.

(2) The laws of chance are often an unsatisfactory basis for action or for specific
advice.

(3) Many experiments do not furnish results which readily lend themselves
to statistical treatment because of bias, lack of randomness, or paucity of the
observations.

(4) Most experiments furnish evidence supplementary to the main issue
which is of the greatest value for the arrival at a reasonable interpretation of the
results.

This type of statement is answered by Goulden (1937), who "doubts very seriously
the contention that all really worthwhile effects are obviously significant.
At any rate this is at best a dangerous concept as evidenced from scores of examples
in published papers where conclusions have been drawn that can be proved by the data
to have very little foundation. ... Thus, the experimentalist who states
that his results are so obvious that they do not require tests of significance is
merely stating that in his experience with such experiments, differences as great as
those obtained are very unlikely to have arisen by chance variation. We have no
quarrel with this reasoning in that it is exactly the type of reasoning employed in
tests of significance. Our contention is merely that a determination of probability
based on a measure of variability furnished by the experiment itself is sound
experimental logic and vastly superior to any method based on pure guesswork."
Questions for Discussion

1.  Explain  why  there  is  no  such  thing  as  exact  measurement  in  quantitative  data. 

2.  Distinguish  between  errors  and  blunders  in  measurements. 

3. Define statistics. What are some typical statistical constants?

4.  In  what  branches  of  science  have  modern  statistical  methods  been  most  extensively 
used?  Why? 

5.  Define  these  terms:  variable,  variate,  frequency,  population,  and  sample. 

6.  What  is  the  mean?  How  does  it  differ  from  the  so-called  weighted  mean? 

7.  What  is  a  frequency  distribution  or  frequency  table? 

8. What is a class interval? How would you determine it for an array of data?

9. What is the difference between a histogram and a frequency polygon?

10. Give 3 measures of central tendency and distinguish between them.



11.  What  is  a  normal  curve?  Skew  curve?  Bimodal  curve? 

12.  Distinguish  between  the  binomial,  normal,  and  Poisson  distributions. 

13.  Define  the  standard  deviation  of  the  sample.  Population. 

14. What is the best estimate of the standard deviation of the population as obtained
from the sample?

15. Prove that

$$\sqrt{\frac{Sd^2}{N}} = \sqrt{\frac{Sx^2}{N} - \left(\frac{Sx}{N}\right)^2}$$

16.  What  is  the  coefficient  of  variability?  When  is  it  correctly  used? 

17.  What  is  meant  by  an  arbitrary  origin?  Where  can  it  be  taken?  Why? 

18. Explain Sheppard's Correction and the reason for its use.

19.  What  are  some  of  the  specific  things  that  statistical  methods  are  expected  to  do 
when  properly  applied  to  data? 

20.  What  are  some  of  the  difficulties  likely  to  be  encountered  in  applying  statistical 
methods  to  field  experiments? 

21.  What  factors  contribute  to  the  reliability  of  statistical  constants? 

22.  Is  the  evidence  afforded  by  statistical  analysis  of  data  negative  or  positive? 
Explain. 

23.  Why  have  statistical  methods  been  only  partially  used  in  agronomy?  Name  3  men 
who  have  advocated  such  methods  in  this  field. 

24. What are the principal arguments of that school of opinion which favors (or
insists on) the application of modern statistical methods to field experiments?

25. What are the principal arguments of those who do not favor the use of such
methods?

26. What is generally indicated when "common sense" and interpretation based on
statistical methods do not agree?


Problems 


1. In determining the moisture content of corn by the Brown-Duvel moisture tester,
the common practice is to base the moisture percentage on the total or wet weight
(corn plus moisture) of the corn. The moisture content of hay, however, is often
expressed as a percentage of the dry weight of the hay.

(a) A variety of corn produced 17.2 lbs. of shelled corn that contained 14.0
per cent moisture on a 12-hill plot. The hills were 3 x 3 feet apart. Calculate
the yield of shelled corn in bushels per acre on a 15.5 per cent moisture basis.

(b) A twentieth-acre plot of hay produced 120 pounds of field-cured hay. Samples
taken when the hay was weighed showed that it contained 20 per cent moisture.
Express the true yield in tons per acre on a 15 per cent moisture basis.

2. Head counts were made on a number of fields in a township as follows:

FIELD NO.     NO. HEADS COUNTED     NO. HEADS SMUTTED

    1                 50                    4
    2               1000                    1
    3                100                    1
    4                500                   15
    5                400                   20
    6                300                    4
    7               1000                   10
    8                600                   12
    9                200                    6
   10              10000                   50

What percent smut may be expected in the wheat delivered to the elevator from
this township?
3. These data were taken from several fields to determine the probable losses from
smut for a community:

FIELD (No.)     PCT. SMUTTED (X) (Heads)     SIZE OF FIELD (f) (Acres)

1  1.0  100 

2  15.0  20 

3  0.5  2:io 

k  20.0  10 

5  0.0  500 

6  o.s  500 

7  3.0  50 
3  2.5  125 
9  0.1  225 

10  5.0  150 

What  percent  smut  may  be  expected? 

4. Some Iowa data were collected to determine the relation of certain ear characters
in corn. The yields from the very short ears, when used for seed, were as follows:

Year     No. Ears Used     Yield, Bu. per Acre

191?  2l|-  1*2 .70 

1918 ; 2 .. 26.33 

Determine the average yield for the very short ears for the 3-year period.

5. The table that follows gives the heights of plants of buckwheat in a study of
variation at Cornell University. Plot the frequency curve on cross-section paper.

Height in centimeters   25  35  45  55  65  75  85  95  105  115  125  135  145  155
Number of plants         2   2   3   5  10  12  60  99  144   85   65   18    2    1

Total 508

Does  this  seem  to  approximate  a  normal  curve?  What  can  be  said  as  to  the  posi- 
tion of  the  mean  in  a  normal  curve?  The  mode?  The  median?  Is  a  normal  curve 
symmetrical?  Is  a  symmetrical  curve  necessarily  normal? 

6. The table that follows gives the average yield of wheat per plant in certain
studies at Cornell University. Plot the frequency curve as in the previous
example.

YIELD PER PLANT (grams)     NUMBER OF PLANTS

0.5  57 

1.5  59 

2.5  88 

5.5  kl 

1+.5  ^5 

5.5  29 

6.5  26 

7.5  5 

8.5  8 

9.5  6 

10.5  '             8 

11.5  '             5 

12.5  5 

13-5  1 

1U.5  1 

15.5  2 

16.5  2 

17.5  g_ 

Total    366 

In what respect does it differ from that of the previous example? What name is
given to frequency curves of this kind? Do the mean, median, and mode coincide
in this curve?

7. The number of stalks was counted on two different kinds of Colsess barley
plants grown in 1930 at the Colorado Experiment Station. One kind was a normal
green (AcAc) and the other heterozygous for a lethal factor (Acac). Plot the
frequency curves. Does the lethal seem to be detrimental to growth?

No. Stalks     Heterozygous Plants     Green Plants
per Plant          (Frequency)          (Frequency)

     1                  5                    7
     2                 14                    9
     3                 51                   28
     4                 62                   33
     5                 63                   31
     6                 41                   19
     7                 21                   12
     8                 12                    4
     9                  5                    1
    10                  1                    0
    11                  0                    1
    12                  0                    0
    13                  1                    0

Totals                276                  145

(Note: Calculate the frequencies of the green plants on a basis of N = 276 in
order to make the two sets of data readily comparable.)

8. Some data were collected by Emerson (1913) for the study of size inheritance in
corn. Classify the data for hybrid 60 x 54. Prepare a frequency table for
these data and calculate the mean of the sample using a guess mean. Continue and
find the standard deviation (s') and the coefficient of variability. The measurements
are given as lengths of ears in centimeters:

Hybrid 60 x 54

15  13  10  12  13  10  13  15  11  10
10  13  15  12  13  14  14  14  11  10
13  12  11  12  11  12  10  13  14  12
11  11  14  10   9  10  11  13  13  14
12  11  10  14  11  13  12  13  13  10
11  12  12  11  13  12  10  13  12  10
11  13  14  13  12  15  14  12  13

9. Calculate the standard deviations (s') for height of plants in problem 5, using
(a) deviations from a true mean and (b) deviations from a guess mean.

10. Some 1930 data on black hulless barley plants were compiled by the Colorado
Experiment Station to determine the variation in number of kernels per plant.
The data are grouped in classes. (a) List the class boundaries, and calculate
the mean, standard deviation, and coefficient of variability. (b) Apply
Sheppard's correction to the standard deviation. (c) Is the number of group
classes sufficient according to Yule's formula? Calculate.

x (class center)   15  45  75  105  135  165  195  225  255  285  315
f (frequency)       2  12  11   26   38   26   18   13   13    3    1  = 163

Note that the origin is taken at the class center below 15.


CHAPTER  VI 
TESTS  OF  SIGNIFICANCE 

I.  Statistics  as  a  Basis  for  Generalization 

So far, the discussion has dealt with a sample and its statistical description. The
investigator may desire to apply the information collected from the samples to describe
the general population. Before he can do that, he must take into consideration
the chance or random errors introduced in the actual taking of the sample.
Chance errors result from the operation of a great many factors, none of which is
dominant, and all of which are relatively similar, equal, and independent. When only
chance errors operate, the data are said to be random and follow the law of great
numbers.

Two kinds of error exist, chance and systematic. Errors due to chance may not be
entirely eliminated but can be submitted to mathematical treatment. Systematic
errors can be largely eliminated when an experiment is properly planned.

II.  Theory  of  Probability 

In the analysis of chance errors, it is necessary to introduce some of the fundamental
concepts of mathematical probability.

(a) Single Probabilities

The probability of the occurrence of an event can be defined from two viewpoints.

(1) Mathematical Probability: The mathematical or a priori probability of an event
is the ratio of the number of ways the event may occur to the total number of ways
it may either occur or fail to occur, assuming all such ways are equally likely.
Thus, the probability of drawing any individual card from an ordinary deck is 1/52,
while that of drawing any card of a given suit is 13/52 or 1/4. Probabilities are
sometimes stated in terms of odds; e.g., suppose the probability of the occurrence
of an event is 1/25. The odds are 1:24 in favor of its occurrence, or 24:1 against
its occurrence. To be more explicit, the occurrence of the event is expected just
once in 25 trials.

(2) Statistical Probability: Suppose an experiment is repeated a great number of
times. When it terminates in a particular manner a certain number of times, the
ratio of this latter number to the total number of trials defines an estimate of the
probability of the particular termination. Suppose $N_1$ and N represent the number of
successes and the number of trials (both successes and failures), respectively; then

$$\lim_{N \to \infty} \frac{N_1}{N}$$

will be defined as the probability of a success. Thus this probability can be approached
but never attained in practical work with infinite populations. The permanency of the
value $N_1/N$ for N large is the law of great numbers. This permanency results from
randomness in the experimental trials and is the necessary property that statistical
data must possess to admit valid treatment by mathematics. As an illustration of
statistical probability, in a frequency distribution, any particular class frequency
divided by the total number of observations in the distribution gives an estimate of the
probability that any individual observation made at random will fall in that particular
class.

It is evident from either definition that the probability of the occurrence of an
event may vary between zero (0), i.e., certainty that the event will not happen, and
one (1), i.e., certainty that the event will happen.

(b) Several Probabilities

When several probabilities are to be dealt with simultaneously, it becomes
necessary to consider two fundamental theorems.

(1) Theorem I: When a number of mutually exclusive events have certain probabilities
of occurrence, the probability of occurrence of some one or other of these events
is the sum of their individual probabilities. For example, the probability that an
observation in the barley yield data (Chapter 3, pages 39 and 40) will fall in class
x = 88 is 12/400, while the probability that one will fall in class x = 107 is 31/400.
The probability that an observation will fall in either class 88 or 107 is 12/400 +
31/400 = 43/400, i.e., P = 0.11.

(2) Theorem II: When a number of independent events have certain probabilities of
occurrence, the probability of all occurring together is the product of their individual
probabilities. In the above example, the probability that the first and second
observations will fall in classes x = 88 and x = 107, respectively, is 12/400 times
31/400 = 372/160,000, i.e., P = 0.0023.
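
The two theorems can be checked with exact fractions. The sketch below uses the class frequencies quoted in the example (12 and 31 out of 400); the helper names `prob_either` and `prob_all` are illustrative only.

```python
from fractions import Fraction


def prob_either(*probs):
    """Theorem I: probability that one or other of several mutually
    exclusive events occurs is the sum of their probabilities."""
    return sum(probs, Fraction(0))


def prob_all(*probs):
    """Theorem II: probability that several independent events all occur
    is the product of their probabilities."""
    result = Fraction(1)
    for p in probs:
        result *= p
    return result


p_88 = Fraction(12, 400)    # class x = 88
p_107 = Fraction(31, 400)   # class x = 107

p_either = prob_either(p_88, p_107)   # 43/400
p_both = prob_all(p_88, p_107)        # 372/160,000 in lowest terms
print(p_either, round(float(p_either), 2))
print(p_both, round(float(p_both), 4))
```

Exact fractions keep the intermediate values identical to the hand calculation; the decimal values are rounded only for display.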

A  --  Large  Sample  Theory 

III.  Probability  and  the  Normal  Curve 

Statistical data that possess the property of randomness often are distributed in a
manner closely expressed by a normal distribution. Many of the sample statistics of
large samples can be mathematically proved to have distributions extremely close to
normal. Therefore, the application of probability to the normal curve is important
in practical work. The area below the curve is taken as one unit. Hence, the area
between any two ordinates may be considered as the probability that an individual
observation will fall within the range defined by the two ordinates. Now, by Theorem
I, the probability that an individual observation will fall within any range is the
sum of the probabilities that it will fall in all sub-divisions of that range.

Thus, for characters which are distributed normally, it is possible to estimate the
probabilities of their occurrence in any given range. This is done by finding the
areas beneath the normal probability curve that correspond to the given range.
Mathematical tables of such areas, called probability integral tables, have been
constructed. (See Table I in appendix.)

Some  of  the  most  important  probabilities  and  ranges  are  given  below  with  the  aid  of 
a  figure. 


[Figure: normal curve with ordinates marked at t = -3σ, -2σ, -1σ, -P.E., +P.E., +1σ, +2σ, and +3σ; the ordinates -P.E. and +P.E. fall at t = -0.6745 and +0.6745, respectively.]
In the case of a normally distributed variable, it is clear that the probability that
an individual observation will fall within a range of σ on either side of the mean
($\bar{x}$) is approximately 0.68; within a range of 2σ it is approximately 0.95; while
within a range of 3σ it is 0.997. Thus, the probability that the observation will differ
from $\bar{x}$ by at least 2σ is 1.00 - 0.95 or about 0.05. In other words, the chances
that an individual will fall outside a range of 2σ are approximately 5:95 or 1:19.
This means that such a situation may be expected about once in 20 times due to chance
alone. In like manner, the probability that the observation will differ from $\bar{x}$ by at
least 3σ is 1.000 - 0.997 or 0.003. Such a result, then, may be expected to happen
only once in 333 times. Therefore, when an observation differs from the mean by
"too much" there arises the important question as to whether or not this abnormal
result might be due to some special cause acting in the case of this individual.
When some special affecting condition is known to exist, common sense leads one to
the conclusion that the extreme abnormality of the observation is more likely due to
the affecting condition than to be expected on the basis of probability.
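
The areas quoted above can be reproduced without a printed table. The following sketch relies on the error function in Python's standard `math` module; the function name `prob_within` is an illustrative choice, and a normally distributed variable is assumed throughout.

```python
import math


def prob_within(t):
    """Probability that a normal observation falls within t standard
    deviations on either side of the mean."""
    return math.erf(t / math.sqrt(2))


within_1 = prob_within(1)   # about 0.68
within_2 = prob_within(2)   # about 0.95
within_3 = prob_within(3)   # about 0.997

for t, p in ((1, within_1), (2, within_2), (3, within_3)):
    # 1 - p is the chance of a deviation of at least t sigma in either direction
    print(t, round(p, 4), round(1 - p, 4))
```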

IV. Levels of Significance

What constitutes an abnormality which is "too much" is a matter of arbitrary decision.
Common usage in this country considers an abnormality of twice the standard deviation
(standard error in this sense) as being sufficient to warrant the statement that the
abnormality or difference from the mean is a real or significant difference.¹ This
does not mean that an individual observation taken at random and showing a significant
difference does not belong to the general population. However, in such a case
one would inquire as to whether the individual case in question was of a special
nature, either inherently or by reason of treatment. Should such a condition be
substantiated, it is quite proper to attribute the abnormality to special cause or
condition and not to chance.

Some workers in the field of statistics use a difference of 3σ as a criterion for a
significant difference. This allows the worker to place more confidence in a conclusion
derived from a "significant" observation, but this advantage is overshadowed
by a possible tremendous loss of information due to the imposition of a too stringent
criterion.

V. Different Kinds of Probability Tables

There are two kinds of probability tables, viz., one-way and two-way tables. The use
of a particular one depends upon the nature of the statistical hypothesis to be tested.
The results obtained in one can be readily explained in terms of the other (see
Livermore, 1931).

(a) One-Way Tables

The principal one-way table for normal curve areas is that devised by
Sheppard and published by Karl Pearson (1914) as Table II. Suppose an ordinate is
erected at a distance on the positive side of the mean exactly twice the standard
deviation (σ). Thus t or d/σ = 2. From Table I (appendix), it is found that the
area (A) that corresponds to t (or d/σ) = 2 is 0.9772, or the area defined by the
interval from minus infinity to the assigned value of t (d/σ). Thus, with the total
area beneath the curve considered as 1.0, the area to the left of the ordinate is
0.9772 while that to the right is 1.0000 - 0.9772 = 0.0228. Thus, P = 0.0228
(about 1/44) is the probability that a value taken at random will exceed the mean
(in one direction only) by an amount equal to 2 or more times the standard deviation
(σ).

¹This approximates 3 times the probable error.

Sometimes probabilities are expressed as odds:

Area inside the ordinates divided by area outside the ordinates is equal to the odds
against the occurrence of a deviation as great or greater than the designated one due
to chance alone. In the above example, 0.9772/0.0228 = 43:1 (approximately).

In this case the odds are 43:1 that a value will not exceed the mean to the extent
of two or more times the standard deviation due to chance alone. Table I (appendix)
is a one-way table.

(b) Two-Way Tables

Suppose one inquires as to the probability of selecting a variate at random
so that it shall fall outside the limits of plus or minus twice the standard deviation.
Two ordinates are erected, one at t or d/σ = -2 and one at t = +2. The
problem is to find the area in both tails of the curve. This will be (1.0000 -
0.9772) times 2 = 0.0456, or double that in the one-way table. This means that the
probability that a single variate selected at random will deviate by an amount equal
to or greater than ±2σ is 0.0456, or approximately 1/22.

The values on a two-way basis can be expressed as odds as follows:
0.9544/0.0456 = 21:1. Thus, the odds are 21:1 against the occurrence of a deviation
as great or greater than the designated one (plus or minus twice the standard deviation)
due to chance alone. A typical two-way table for large samples is Table IV
given by Davenport (1936).

In summary, it should be clear that the one-way interpretation or the use of
a one-way table gives the probability or odds that an obtained value shows a certain
discrepancy from the mean in a stated direction, whereas the two-way interpretation
does not state the direction which the discrepancy must take in a statement of probability
or odds.
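
The distinction between the two kinds of table can be sketched numerically. The function names below are illustrative, the normal areas are computed from the error function rather than read from Table I, and the results therefore differ from the tabled four-place values only in the last decimal.

```python
import math


def one_way_area(t):
    """Area under the normal curve from minus infinity to t (one-way table)."""
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))


def one_way_tail(t):
    """Probability of exceeding the mean by t sigma or more in one direction."""
    return 1 - one_way_area(t)


def two_way_tail(t):
    """Probability of deviating by t sigma or more in either direction."""
    return 2 * one_way_tail(t)


def odds_against(p):
    """Odds against an event of probability p, expressed as a single ratio."""
    return (1 - p) / p


t = 2
print(round(one_way_area(t), 4), round(one_way_tail(t), 4))
print(round(two_way_tail(t), 4))
print(round(odds_against(one_way_tail(t))), round(odds_against(two_way_tail(t))))
```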

(c) Transformation of Values

The probability values obtained in one type of table can be readily transformed
into terms of the other to meet the experimental argument at hand. Probability
values in a one-way table can be doubled to give the results obtained from a
two-way table, and vice versa.

The transformation of odds is as follows:

$$\text{Odds in two-way table} = \frac{(\text{odds in one-way table}) - 1}{2}$$

$$\text{Odds in one-way table} = (\text{odds in two-way table})(2) + 1$$
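
The two rules follow directly from the fact that the two-way probability is double the one-way probability. A short derivation is sketched here; the symbols p (one-way tail probability), $O_1$ (odds against in the one-way table), and $O_2$ (odds against in the two-way table) are introduced only for this illustration.

$$O_1 = \frac{1 - p}{p} = \frac{1}{p} - 1, \qquad O_2 = \frac{1 - 2p}{2p} = \frac{1}{2p} - 1$$

$$O_2 = \frac{O_1 + 1}{2} - 1 = \frac{O_1 - 1}{2}, \qquad O_1 = 2\,O_2 + 1$$

For t = 2, $O_1$ = 43 gives $O_2$ = (43 - 1)/2 = 21, in agreement with the odds quoted for the one-way and two-way tables.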

VI. Standard Errors of Statistical Constants

Each  statistical  constant  or  estimate  has  its  own  standard  error.  The  standard 
error  of  a  statistic  derived  from  a  sample  is  the  standard  deviation  of  the  distri- 
bution of  that  statistic  thought  of  as  resulting  from  many  samples.  The  distribu- 
tions of  many  statistics  are  nearly  normal,  particularly  when  the  basic  sample  is 
large. 

(a) Standard Error of a Single Observation

The "best"¹ estimate of the standard deviation of a single observation (σ)
is the standard error (s) derived from the sample.

¹Note: The best unbiased estimate is simply called "best". See more advanced treatments
of mathematical statistics.

Some data on the total weight of grain in grams for non-competitive Colsess barley
plants are as follows:


Class  Center   1   3   5   7   9   11   13   15   17   19  21  23  25  27  29  31  33 
Frequency      3  11  21  35  kj       55   7I   52   k'J       35  21  10   9  11   5   1   1 

N  =  ta  x  =  13.8  Sfd2  =  l^lG.bh 


$$s = \text{standard error of a single observation} = \sqrt{\frac{Sfd^2}{N - 1}} \qquad (1)$$

In the above example, it would be calculated as follows:

$$s = \sqrt{\frac{Sfd^2}{N - 1}} = 6.003 \text{ grams}$$

This value, s = 6.003 grams, is the standard error of a single variate in this sample.
For instance, the value of the mean, 13.8 ± 2(6.003), indicates that the odds
are 21:1 that a single individual taken at random will not deviate from the mean by
more than 2σ in either direction, where the normal distribution of the population
affording the sample data is assumed.
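
Formula (1) can be applied to any frequency table. The sketch below uses a small hypothetical table (five class centers with made-up frequencies, not the barley data) so that each step can be verified by hand; the function name `grouped_standard_error` is illustrative.

```python
import math


def grouped_standard_error(class_centers, frequencies):
    """Return N, the mean, and s = sqrt(S(f d^2) / (N - 1)) for grouped data,
    where d is the deviation of a class center from the sample mean."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(class_centers, frequencies)) / n
    sum_fd2 = sum(f * (x - mean) ** 2 for x, f in zip(class_centers, frequencies))
    return n, mean, math.sqrt(sum_fd2 / (n - 1))


# Hypothetical frequency table, for illustration only.
centers = [1, 3, 5, 7, 9]
freqs = [1, 4, 6, 4, 1]

n, mean, s = grouped_standard_error(centers, freqs)
print(n, mean, round(s, 3))
# Limits of mean +/- 2s, the range discussed in the text.
print(round(mean - 2 * s, 3), round(mean + 2 * s, 3))
```

Here S(fd²) = 64 and N - 1 = 15, so s is the square root of 64/15.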

(b)  Standard  Error  of  the  Mean 

Suppose  a  second  sample  were  taken.  One  could  hardly  expect  to  get  exactly 
the  same  result  for  the  mean  (x)  as  in  the  sample  in  question.  Thus,  the  mean  (x) 
obtained  from  a  single  sample  is  merely  an  estimate  of  the  true  mean  (m)  of  the  whole 
population.  The  latter  is  unknown  and  necessarily  must  remain  so.   In  case  it  were 
possible  and  practical  to  take  and  analyze  a  greater  number  of  samples,  finding  the 
mean  (x)  for  each,  one  would  expect  the  mean  of  all  the  sample  means  to  be  very  close, 
indeed,  to  the  mean  of  the  population  (m)  .  Since  this  is  not  feasible,  one  can  only 
ask  how  good  an  estimate  of  the  population  mean  (m)  is  the  mean  (x)  computed  from  a 
single sample. The answer to this question can only be given in terms of probabilities. It can be shown mathematically that the means computed from a large number of large samples are distributed nearly normally with standard deviation $\sigma_{\bar{x}}$, which is theoretically equal to the ratio of the standard deviation of the population to $\sqrt{N}$, where N is the number of observations that make up the sample.² However, the standard deviation of the population ($\sigma$) is unknown and in its stead its estimate (s) derived from the sample is used. Therefore the standard deviation of the hypothetical distribution of means of a large number of samples will be estimated as follows:

$$\sigma_{\bar{x}} = \text{standard error of the mean}^{3} = \frac{s}{\sqrt{N}} \qquad (2)$$


The greater the number of observations in the sample, the smaller will be the standard errors of the various statistical constants. Hence, the statistical constants
derived  from  a  large  sample  are  more  likely  to  represent  the  true  constants  of  the 
general  population  than  those  derived  from  a  small  sample.  When  the  sample  is  small, 
the  argument  is  the  same  except  that  the  distribution  for  x  deviates  from  normality 
and  needs  special  interpretation. 


²It should be noted that $s = \sqrt{Sfd^2/(N-1)}$, which best estimates the standard deviation of the population, closely approximates $s' = \sqrt{Sfd^2/N}$, which is the standard deviation of the sample and is the estimate of $\sigma$ given by the maximum likelihood principle.

³$\sigma_{\bar{x}}$ is sometimes expressed as "S.E. of the mean."

In the above example the standard error of the mean ($\sigma_{\bar{x}}$) is:

$$\sigma_{\bar{x}} = \frac{s}{\sqrt{N}} = \frac{6.003}{\sqrt{411}} = 0.296 \text{ grams}$$

The mean is 13.8 ± 0.296 grams. Therefore, the odds are 21:1 that $\bar{x}$ (13.8 grams) does not differ from the unknown true mean (m) of the general population by more than $2\sigma_{\bar{x}}$, or 2(0.296) = 0.592 grams.
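As a check on the arithmetic, the standard error of the mean can be recomputed in a few lines of Python. This is a sketch only; N = 411 is the sample size implied by the worked figures (6.003 / 0.296 is about the square root of 411), and the variable names are ours.

```python
import math

s = 6.003   # standard error of a single observation, grams
n = 411     # number of plants in the sample (implied by the worked example)

se_mean = s / math.sqrt(n)                    # equation (2)
print(f"standard error of the mean = {se_mean:.3f} grams")
print(f"twice the standard error   = {2 * se_mean:.3f} grams")
```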

(c)  Standard  Error  of  the  Standard  Deviation 

Next, it is desired to discover how reliably the standard deviation of a single sample (s) estimates the unknown standard deviation of the population ($\sigma$). Mathematically, it has been found that the best estimate of the standard deviation of a hypothetical distribution of standard deviations derived from a great number of samples is as follows:

$$\sigma_{\sigma} = \text{standard error of standard deviation} = \frac{s}{\sqrt{2N}} \text{ (approximately)} \qquad (3)$$

From the example used above,

$$\sigma_{\sigma} = \frac{s}{\sqrt{2N}} = \frac{6.003}{\sqrt{2(411)}} = 0.209 \text{ grams}$$

Therefore, the odds are 21:1 that s = 6.003 grams does not differ from the unknown true standard deviation of the general population ($\sigma$) by more than $2\sigma_{\sigma}$, or 2(0.209) = 0.418 grams.

(d)  Standard  Error  of  the  Coefficient  of  Variability 

By use of the same argument, the standard error of the coefficient of variability is:

$$\sigma_{C.V.} = \frac{C.V.}{\sqrt{2N}} \left[ 1 + 2\left(\frac{C.V.}{100}\right)^2 \right]^{1/2} \quad \text{when C.V. is greater than 10} \qquad (4)$$

$$\sigma_{C.V.} = \frac{C.V.}{\sqrt{2N}} \quad \text{when C.V. is less than 10} \qquad (5)$$

In the example used,

$$\sigma_{C.V.} = \frac{43.5}{\sqrt{2(411)}} \left[ 1 + 2\left(\frac{43.5}{100}\right)^2 \right]^{1/2} = 1.781$$

A table has been worked out by Brown (1934) to shorten the computation necessary to secure the standard error of the coefficient of variability when C.V. is greater than 10.
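The standard errors of the standard deviation and of the coefficient of variability can be verified together. The sketch below assumes the values of the barley example (mean 13.8, s = 6.003, N = 411) and applies the longer formula only when C.V. exceeds 10, as the text directs; the variable names are illustrative.

```python
import math

mean, s, n = 13.8, 6.003, 411

se_sd = s / math.sqrt(2 * n)          # standard error of the standard deviation

cv = 100 * s / mean                   # coefficient of variability, per cent
se_cv = cv / math.sqrt(2 * n)         # short formula, adequate when C.V. < 10
if cv > 10:
    # correction factor for large coefficients of variability
    se_cv *= math.sqrt(1 + 2 * (cv / 100) ** 2)

print(f"S.E. of standard deviation = {se_sd:.3f} grams")
print(f"C.V. = {cv:.1f}, S.E. of C.V. = {se_cv:.3f}")
```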

(e)  Standard  Error  of  an  Average  of  Averages 

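The standard error of an average of averages is given by the formula:

$$\sigma_a = \frac{1}{N} \sqrt{\sigma_{\bar{x}_1}^2 + \sigma_{\bar{x}_2}^2 + \cdots + \sigma_{\bar{x}_N}^2} \qquad (6)$$

where N equals the number of separate means and $\sigma_{\bar{x}_1}$, $\sigma_{\bar{x}_2}$, etc., represent their separate standard errors.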
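A small Python sketch may make the formula concrete. The four standard errors used here are hypothetical values chosen only for illustration, and the function name is ours.

```python
import math

def se_average_of_averages(standard_errors):
    """Standard error of the average of several means, given their separate standard errors."""
    n = len(standard_errors)
    return math.sqrt(sum(se ** 2 for se in standard_errors)) / n

# Hypothetical standard errors of four separate means
example = se_average_of_averages([0.7, 0.6, 0.9, 1.2])
print(f"standard error of the average of averages = {example:.4f}")
```

With a single mean the function returns that mean's own standard error, as it should.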

(f )  Standard  Error  of  a  Difference 

Suppose two samples are measured with respect to a common character. From the data, let two similar statistical constants be compared, e.g., the two means or the two standard deviations. The question arises as to whether the two constants differ significantly. Its answer depends upon the standard error of the difference, which is as follows:¹

$$\sigma_d = \sqrt{\sigma_1^2 + \sigma_2^2} \qquad (7)$$

where $\sigma_1$ and $\sigma_2$ are the standard errors of the two like statistical constants derived from the two samples. Where a significant difference results in the case of two samples drawn from the same population, it would indicate probable improper sampling technique leading to lack of randomness.² The principal use of this method lies in its test as to whether or not a factor known to exist in the case of one sample, and not in the other, is really a causal factor to which an abnormal difference can be attributed, e.g., the difference between two yields in a yield trial.

For example, suppose it is desired to determine the standard errors of the difference of the mean yields of Kanred and Turkey wheats, and also for Manchuria and Minnesota 445 barleys.

(1) Wheat Variety    Yield (Bu.)        (2) Barley Variety    Yield (Bu.)

    Kanred           25 ± 0.7               Manchuria         38.9 ± 0.9
    Turkey           24 ± 0.6               Minnesota 445     48.5 ± 1.2

    Difference        1 ± 0.92              Difference         9.6 ± 1.5

$$\sigma_d = \sqrt{(0.7)^2 + (0.6)^2} = 0.92 \qquad\qquad \sigma_d = \sqrt{(0.9)^2 + (1.2)^2} = 1.5$$
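The two standard errors of a difference in this example can be reproduced as follows. This is a minimal sketch of relation (7), which assumes uncorrelated data; the function name is ours.

```python
import math

def se_difference(se1, se2):
    """Standard error of the difference of two independent statistics."""
    return math.sqrt(se1 ** 2 + se2 ** 2)

wheat = se_difference(0.7, 0.6)    # Kanred vs. Turkey
barley = se_difference(0.9, 1.2)   # Manchuria vs. Minnesota 445
print(f"wheat:  difference = {25 - 24} ± {wheat:.2f}")
print(f"barley: difference = {48.5 - 38.9:.1f} ± {barley:.1f}")
```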

VII.  Significant  Differences 

After a difference is obtained for two statistical constants, as in the above example, it is desirable to test this difference for statistical significance. An investigator may arbitrarily choose whatever level of significance he desires, but should state the level chosen. He must use care in attributing differences to causal factors when the differences approach the level of significance that he has chosen. To determine the significance of the difference between two statistical constants, their difference divided by the standard error of the difference ($\sigma_d$) is commonly employed. For example, in the case of Kanred and Turkey wheats cited above,

$$t = \frac{d}{\sigma_d} = \frac{1}{0.92} = 1.09$$

When the level of significance is taken as $d/\sigma_d = 2$, this difference is not significant. On the basis of probability, the odds are only a little more than 2:1 that this difference is a real or significant difference. Hence, it may be ascribed to chance, or to put it in another way, one would not claim superiority for Kanred because the probability is too large that such a statement is incorrect.

In a comparison of Manchuria and Minnesota 445 barley,

$$t = \frac{d}{\sigma_d} = \frac{9.6}{1.5} = 6.4$$

Since $t = d/\sigma_d$ is far greater than 2, the difference in yield between the two varieties is said to be real and not due to chance. In this case the claim is made that Minnesota 445 is superior to Manchuria in yielding ability. The probability of the incorrectness of this statement is insignificantly small. It would be a miracle if this claim were really incorrect.

¹When $\sigma_1 = \sigma_2$, $\sigma_d = \sigma_1\sqrt{2}$.

²Relation (7) holds strictly only where the two variable statistics are normally distributed and derived from uncorrelated data.
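The ratio of a difference to its standard error, and the odds attached to it, can be sketched in Python as below. The sketch assumes the large-sample case in which the ratio is referred to the normal curve on a two-way (both tails) basis; the function name `ratio_and_odds` is ours.

```python
import math

def ratio_and_odds(d, se_d):
    """Return d / se_d and the odds (to 1) against so large a deviation arising by chance."""
    t = d / se_d
    p_chance = math.erfc(abs(t) / math.sqrt(2.0))   # two-tailed area of the normal curve
    odds = (1.0 - p_chance) / p_chance
    return t, odds

t_wheat, odds_wheat = ratio_and_odds(1.0, 0.92)     # Kanred vs. Turkey
t_barley, odds_barley = ratio_and_odds(9.6, 1.5)    # Manchuria vs. Minnesota 445
print(f"wheat:  t = {t_wheat:.2f}, odds about {odds_wheat:.1f} to 1")
print(f"barley: t = {t_barley:.1f}, odds about {odds_barley:.3g} to 1")
```

For the wheats the odds fall between 2:1 and 3:1, well short of the 21:1 attached to a ratio of 2; for the barleys they are enormous.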


VIII.  Probable  Errors. 

The quantity 0.6745$\sigma$, which gives the range on either side of the mean that contains half the observations, is termed the probable error. For example, an average yield of 15.0 ± 1.5 bushels would mean that the chances are 50:50 that the true value of the average for an infinite population lies between 13.5 and 16.5 bushels. It also indicates that the chances are even that it may lie outside this range.

(a)  Use  of  Probable  Errors 

Historically,  the  probable  error  was  used  before  the  standard  error.  It  is 
still  widely  used  in  this  country  in  the  statistical  treatment  of  biological  data, 
but the tendency is to use the standard error and think in terms of it. It is generally felt that "the probable error is an unmitigated nuisance," and has nothing to recommend it except its previous usage.

(b)  Formulae  for  Probable  Errors 

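The probable error is approximately two-thirds of the standard error. It can be obtained when each standard error value is multiplied by 0.6745. These formulae may be briefly summarized:

$$\text{(1)}\quad \text{P.E. single determination} = \pm\, 0.6745\, s \qquad (8)$$

$$\text{(2)}\quad \text{P.E.}_{\bar{x}} = \pm\, 0.6745\, \frac{s'}{\sqrt{N-1}} \quad \text{or} \quad \pm\, 0.6745\, \frac{s}{\sqrt{N}} \qquad (9)$$

$$\text{(3)}\quad \text{P.E.}_{\sigma} = \pm\, 0.6745\, \frac{s'}{\sqrt{2N}} \qquad (10)$$

$$\text{(4)}\quad \text{P.E.}_{C.V.} = \pm\, 0.6745\, \frac{C.V.}{\sqrt{2N}} \left[ 1 + 2\left(\frac{C.V.}{100}\right)^2 \right]^{1/2} \qquad (11)$$

where C.V. is greater than 10.

$$\text{(5)}\quad \text{P.E.}_{C.V.} = \pm\, 0.6745\, \frac{C.V.}{\sqrt{2N}} \qquad (12)$$

where C.V. is less than 10.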

(c)  Levels  of  Significance  for  Probable  Errors 

The level of significance for the probable error is commonly taken as $D/\text{P.E.}_d = 3$. This is equivalent to odds of about 22:1. Some workers use 3.2 times the probable error, for which the odds are approximately 30:1. A table of odds for probable errors is given by Hayes and Garber (1927) in "Breeding Crop Plants."


(d)  Relation  of  Standard  Errors  to  Probable  Errors 

Based on the normal curve, the quartile lines, $Q_1$ and $Q_3$, give the probable error of a single variate, or $Q = 0.6745\,\sigma$.

The intervals

    $\bar{x} \pm 1\sigma$   include   68.3% of variates        $\bar{x} \pm 1$ P.E.   include   50.0% of variates
    $\bar{x} \pm 2\sigma$   include   95.5%  "    "            $\bar{x} \pm 2$ P.E.   include   82.3%  "    "
    $\bar{x} \pm 3\sigma$   include   99.7%  "    "            $\bar{x} \pm 3$ P.E.   include   95.7%  "    "

B  --  Special  Case  of  Small  Samples 


IX. Use of Small Samples in Biological Research

The methods heretofore explained relate to the determination of significance based on the normal distribution for large samples, but it is not always possible to obtain large samples. This is often the case in agricultural or biological experiments. When the investigator can be certain that the populations which afford small samples approximate the normal distribution in form, he may feel that the interpretation of the statistical analysis is valid. Therefore, the material that follows is given on the basis of small samples from populations whose distributions approach that of the normal curve. Statistical treatment of small samples from populations far from normal in distribution is likely to be inadequate. Too often it may lead to incorrect conclusions. Statistical analysis of a single sample with less than 20 cases is hazardous. In samples of 20 to 100 cases, the near-normality of the underlying population should be known. This places a severe limitation on the use of small samples, but fortunately in agricultural and biological experiments, most of the populations with which the experimenter deals are near normal. The importance of the small sample, together with its statistical treatment, has been discussed by Fisher (1934).

X.  Degrees  of  Freedom 


The reliability of a statistic (estimate of a population parameter) will obviously depend upon the number of variates in the sample. This dependence is also affected by the number of restrictions placed on the aggregate observations in the determination of an estimate of a population parameter. The total number of observations diminished by the number of restrictions which they in aggregate must submit to has been termed "degrees of freedom" by Fisher (1934).

It has been stated that the best estimate of the variance of a population ($\sigma^2$) derived from the sample is as follows:

$$s^2 = \frac{S(x - \bar{x})^2}{N-1} \qquad (13)$$

In this case, the number of individual observations (N) is diminished by one to give the degrees of freedom. The number of statistical constants of the sample which are directly used in the computation is subtracted. The mean or total fixes one value in the above formula, so that only N-1 observations are free to vary. This is of little importance when a large sample is analyzed, but very important in small samples.
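The loss of one degree of freedom can be seen directly: once the mean is fixed, the deviations must sum to zero, so the last deviation is determined by the others. The Python sketch below uses five hypothetical observations chosen only for illustration.

```python
values = [49, 47, 39, 37, 46]          # hypothetical observations
n = len(values)
mean = sum(values) / n

deviations = [x - mean for x in values]

# The first n - 1 deviations fix the last one, because all n must sum to zero.
implied_last = -sum(deviations[:-1])

variance = sum(d * d for d in deviations) / (n - 1)   # equation (13), N - 1 degrees of freedom
print(f"last deviation = {deviations[-1]:.1f}, implied by the others = {implied_last:.1f}")
print(f"variance on {n - 1} degrees of freedom = {variance:.1f}")
```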

XI. Probability Determinations with Small Samples

The distribution of $\bar{x}/\sigma_{\bar{x}}$ is not sufficiently close to normal for small samples. The nature of the distribution of $\bar{x}/s'$ was found by "Student" in 1908. He prepared a series of tables based on the distribution of $\bar{x}/s'$ (where $s' = \sqrt{S(x - \bar{x})^2/N}$), which he designated as "Z". He showed that the "Z" distribution, now more commonly called Student's distribution, was the same as the Pearson Type III curve. More recently, he has prepared tables for the distribution of "t", which is designated as $\bar{x}/\sigma_{\bar{x}}$ or $\bar{x}\sqrt{N}/s$ by Fisher (1934). For a given value of "t" that corresponds to a given number of degrees of freedom one can read the probability in an analogous manner to the way the tables of areas of the normal curve are used.

The "t" table devised by Fisher (1934) is a two-way table. A probability of 0.05 is Fisher's 5 per cent point for which the odds are 19:1; a probability of 0.01 is the one per cent point for which the odds are 99:1. In addition several one-way tables are in use. These are as follows: (1) Student's "t" table, (2) Livermore's modification of Student's "t", (3) Student's "Z", and (4) Love's modification of "Z". For example, suppose t = 4.604 for 4 degrees of freedom. The probability as found in a one-way table is equal to 0.995. The calculated odds would be 199:1. They are calculated as follows: 1 - P = 1.000 - 0.995 = 0.005. 5/1000 = 1/200. P = 1/200 is equivalent to odds of 199:1.
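The conversion from a tabled probability to odds, and from a one-way to a two-way basis, can be sketched in Python. The sketch assumes a symmetrical distribution such as "t", so that the two-way tail is simply twice the one-way tail; the function name is ours.

```python
def odds_against_chance(tail_probability):
    """Odds (to 1) that a deviation is not due to chance, given the tail probability."""
    return (1.0 - tail_probability) / tail_probability

one_way_tail = 1.000 - 0.995       # t = 4.604 with 4 degrees of freedom, one-way table
two_way_tail = 2 * one_way_tail    # both tails of a symmetrical distribution

print(f"one-way odds: {odds_against_chance(one_way_tail):.0f} to 1")
print(f"two-way odds: {odds_against_chance(two_way_tail):.0f} to 1")
```

On the two-way basis the same value of t corresponds to P = 0.01, or odds of 99:1, which is Fisher's one per cent point.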

XII.  Significance  of  Means 

When d is the difference between the mean of the sample and any value (m') assumed to be the mean (m) of the population, it has been stated that the difference, $d = \bar{x} - m'$, is significant when $d/\sigma_{\bar{x}}$ exceeds 2. When this occurs, the hypothesis (m = m') is rejected. This procedure holds when $d/\sigma_{\bar{x}}$ is nearly normally distributed as in large samples. As this distribution is not close to normal for small samples, the "t" table should be used in such cases. When the 5 per cent point is used as the level of significance, a value of $t = d/\sigma_{\bar{x}}$ that equals or exceeds the tabular value for P = 0.05 is considered as significant. In this test for the significance of the mean one determines the probability of drawing a sample with a mean equal to $\bar{x}$ from a population whose true mean (m) is assumed to be some particular value (m').

XIII.  Means  of  Two  Independent  Samples 

One of the most important problems in statistics is to test the significance of a difference between two means, i.e., $\bar{x}_1 - \bar{x}_2 = d$. Previously, it has been stated that the standard error of the difference of the means of two samples is $\sigma_d = \sqrt{\sigma_{\bar{x}_1}^2 + \sigma_{\bar{x}_2}^2}$. Should there be any reason to suspect that the standard deviations of the two underlying populations are different, one should form $d/\sigma_d$ with $\sigma_d$ as given here.

(a)  Samples  with  Different  Numbers  of  Observations 

When it can be assumed that the standard deviations of the populations are the same, or that the samples have been drawn from the same population, then the best estimate (s) of the population standard deviation ($\sigma$) is:

$$s = \sqrt{\frac{S(x_1 - \bar{x}_1)^2 + S(x_2 - \bar{x}_2)^2}{(N_1 - 1) + (N_2 - 1)}} \qquad (14)$$

Here $N_1$ and $N_2$ are the numbers of observations in the two samples while the denominator evidently denotes the degrees of freedom. This method to determine s as an estimate of $\sigma$ is particularly important in the case of small samples.

The "t" value, equivalent to $d/\sigma_d$, is calculated as follows:¹

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s} \sqrt{\frac{N_1 N_2}{N_1 + N_2}} \qquad (15)$$

(b)  Samples  with  Same  Number  of  Observations 

The above formulae are simplified when the number of observations is the same in each sample, i.e., $N_1 = N_2 = N$.

The standard error (single observation) is as follows:

$$s = \sqrt{\frac{S(x_1 - \bar{x}_1)^2 + S(x_2 - \bar{x}_2)^2}{2(N-1)}} \qquad (16)$$

The value of "t" is as follows:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s} \sqrt{\frac{N}{2}} \qquad (17)$$

Some data presented by Immer (1936) may be used to illustrate the computation. Single plots of Velvet and Glabron barley were grown side by side on 12 different farms. The yields in bushels per acre are given below:

Farm No.      Glabron ($x_1$)      Velvet ($x_2$)      Sum

    1              49                   42              91
    2              47                   47              94
    3              39                   38              77
    4              37                   32              69
    5              46                   41              87
    6              52                   41              93
    7              51                   45              96
    8              57                   56             113
    9              45                   42              87
   10              45                   39              84
   11              48                   47              95
   12              64                   39             103

Sum           $S(x_1)$ = 580       $S(x_2)$ = 509       1089

Mean          $\bar{x}_1$ = 48.3333    $\bar{x}_2$ = 42.4167    $\bar{x}$ = 45.3750

Sum of squares    $S(x_1^2)$ = 28,620      $S(x_2^2)$ = 21,979      100,269

              $(Sx_1)^2/N$ = 28,033.31     $(Sx_2)^2/N$ = 21,590.10


¹This can be readily proved as follows:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_d} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s^2}{N_1} + \dfrac{s^2}{N_2}}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{N_2 s^2 + N_1 s^2}{N_1 N_2}}} = \frac{\bar{x}_1 - \bar{x}_2}{s} \sqrt{\frac{N_1 N_2}{N_1 + N_2}}$$

where "s" is an estimate derived by pooling the two samples, based on the hypothesis that the two populations have a common standard deviation ($\sigma$).

The computations are as follows:

$$s^2 = \frac{\left[S(x_1^2) - (Sx_1)^2/N\right] + \left[S(x_2^2) - (Sx_2)^2/N\right]}{2(N-1)}$$

$$= \frac{(28{,}620.00 - 28{,}033.31) + (21{,}979.00 - 21{,}590.10)}{22}$$

$$= \frac{586.69 + 388.90}{22} = \frac{975.59}{22} = 44.3450$$

$$s = \sqrt{44.3450} = 6.6592$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s} \sqrt{\frac{N}{2}} = \frac{48.33 - 42.42}{6.6592} \sqrt{\frac{12}{2}} = 2.1769$$

The "t" table is entered for t = 2.1769 for 2(N-1) = 22 degrees of freedom. P lies between 0.05 and 0.02. It may be concluded that the odds are in excess of 19:1 that the difference between the mean yields of these two varieties is not due to chance.
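The pooled computation for two independent samples can be sketched in Python as below. The yields are those of the Glabron and Velvet table (the stated totals, 580 and 509, serve as a check on the transcription), and the function name `pooled_t` is ours. The result agrees with the hand computation to three significant figures; small differences in the last places arise from rounding in the hand work.

```python
import math

# Yields in bushels per acre on the 12 farms
glabron = [49, 47, 39, 37, 46, 52, 51, 57, 45, 45, 48, 64]
velvet = [42, 47, 38, 32, 41, 41, 45, 56, 42, 39, 47, 39]

def pooled_t(x1, x2):
    """Return (s, t, degrees of freedom) for two independent samples with a common sigma."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    ss1 = sum((x - m1) ** 2 for x in x1)     # S(x1 - mean1)^2
    ss2 = sum((x - m2) ** 2 for x in x2)     # S(x2 - mean2)^2
    df = (n1 - 1) + (n2 - 1)
    s = math.sqrt((ss1 + ss2) / df)          # pooled standard error of a single observation
    t = (m1 - m2) / s * math.sqrt(n1 * n2 / (n1 + n2))
    return s, t, df

s, t, df = pooled_t(glabron, velvet)
print(f"s = {s:.4f}, t = {t:.4f}, degrees of freedom = {df}")
```

Because the function uses $N_1$ and $N_2$ separately, it serves equally for samples with different numbers of observations.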

XIV.  Means  of  Paired  Samples 

In this case, the variables are paired, i.e., each value of $x_1$ is associated in some logical way with a corresponding value of $x_2$. As a result, there will be the same number of variates in the two samples. When there are N pairs there will be N-1 degrees of freedom available for the comparison. This is widely known as Student's Pairing Method.

(a) Student's Pairing Method

This method is devised to compare two results on a probability basis. It is used primarily for small samples, it not being necessary to assume a normal population. Partial mathematical proof of the method was first published by Student (W. S. Gosset) in 1908. Differences between paired values are dealt with directly, with the result that the correlation between paired values is taken into account. The method was brought to the attention of American agronomists in 1925 by Love, et al. (1925, 1924).

The variance ($s^2$) and "t" values are calculated as follows:

$$s^2 = \text{variance} = \frac{S(d^2) - (Sd)^2/N}{N-1} \qquad (18)$$

$$t = \frac{\bar{d}}{\sqrt{s^2/N}} \qquad (19)$$

Here "t" is used to test an obtained value, $\bar{d}$, in accordance with the hypothesis that the mean of the population of differences is zero. A significant result would mean the rejection of the hypothesis and would warrant a statement that the mean of one of the basic populations exceeded that of the other.

(b)  Method  of  Computation 

The method of computation can be illustrated from the Glabron vs. Velvet barley yields mentioned above. The computation follows:

Farm No.      Glabron ($x_1$)      Velvet ($x_2$)      d (Velvet from Glabron)

    1              49                   42                   7
    2              47                   47                   0
    3              39                   38                   1
    4              37                   32                   5
    5              46                   41                   5
    6              52                   41                  11
    7              51                   45                   6
    8              57                   56                   1
    9              45                   42                   3
   10              45                   39                   6
   11              48                   47                   1
   12              64                   39                  25

Sum           $S(x_1)$ = 580       $S(x_2)$ = 509       $S(d)$ = 71

Mean          $\bar{x}_1$ = 48.3333    $\bar{x}_2$ = 42.4167    $\bar{d}$ = 5.9167

$S(d^2)$ = 929          $(Sd)^2/N$ = 420.0857

$$s^2 = \frac{S(d^2) - (Sd)^2/N}{N-1} = \frac{929.0000 - 420.0857}{11} = 46.2649$$

$$t = \frac{\bar{d}}{\sqrt{s^2/N}} = \frac{5.9167}{\sqrt{46.2649/12}} = \frac{5.9167}{\sqrt{3.8554}} = 3.0133$$

The value of "t" is then looked up in the t-table (Fisher, 1934) for 11 degrees of freedom (12 - 1 paired values), where it is found that the observed value lies between P = 0.02 and P = 0.01.
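The paired computation can be reproduced with a short Python sketch. It assumes the same twelve pairs of yields as above and follows equations (18) and (19); the variable names are ours.

```python
import math

glabron = [49, 47, 39, 37, 46, 52, 51, 57, 45, 45, 48, 64]
velvet = [42, 47, 38, 32, 41, 41, 45, 56, 42, 39, 47, 39]

# Differences for each farm (Velvet subtracted from Glabron)
d = [g - v for g, v in zip(glabron, velvet)]
n = len(d)

sum_d = sum(d)                                   # S(d)
sum_d2 = sum(x * x for x in d)                   # S(d^2)
variance = (sum_d2 - sum_d ** 2 / n) / (n - 1)   # equation (18)
mean_d = sum_d / n
t = mean_d / math.sqrt(variance / n)             # equation (19)

print(f"S(d) = {sum_d}, S(d^2) = {sum_d2}")
print(f"s^2 = {variance:.4f}, t = {t:.4f} on {n - 1} degrees of freedom")
```

Pairing raises t from about 2.18 to about 3.01 on the same data, because the farm-to-farm variation common to both varieties is removed from the error.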

The "Z" table devised by Student is sometimes used. He designed "Z" as the ratio of the mean difference to the standard deviation of the differences, i.e.,

$$Z = \frac{\bar{x}}{s'} \quad \text{where} \quad s' = \sqrt{\frac{S(x - \bar{x})^2}{N}}$$

Student (1926) calls attention to the fact that the "Z" table should be entered with N-1 degrees of freedom. As mentioned previously, his "Z" table is a one-way table. The Z-value can be transformed to "t" as follows:

$$t = Z\sqrt{N-1}$$
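The relation between "Z" and "t" follows from the two divisors: s' uses N and s uses N-1, so that t = Z multiplied by the square root of N-1. The Python sketch below checks this numerically on the twelve Glabron minus Velvet differences; it is an illustration under that definition of Z, with variable names of our own choosing.

```python
import math

d = [7, 0, 1, 5, 5, 11, 6, 1, 3, 6, 1, 25]   # paired differences from the barley example
n = len(d)
mean_d = sum(d) / n
ss = sum((x - mean_d) ** 2 for x in d)        # sum of squared deviations

s_prime = math.sqrt(ss / n)                   # Student's s', divisor N
s = math.sqrt(ss / (n - 1))                   # best estimate s, divisor N - 1

z = mean_d / s_prime                          # Student's Z
t = mean_d / (s / math.sqrt(n))               # Fisher's t

print(f"Z = {z:.4f}, t = {t:.4f}, Z * sqrt(N - 1) = {z * math.sqrt(n - 1):.4f}")
```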

(c) Application of the Pairing Method

The application of this method is highly desirable for making comparisons between pairs of varieties or treatments when the scope of the experiment is limited to a few pairs of observations. It is useful for simple tests such as treated vs. untreated where only two or three things are being compared. In plot work, the method can only be used to remove soil heterogeneity where the plots are physically paired, i.e., adjacent.

Questions for Discussion

1.  What  is  the  basis  for  using  statistics  for  generalization? 

2.  Distinguish  between  a  priori  and  statistical  probability. 

3.  Give  two  basic  theorems  where  several  probabilities  are  involved. 

4.  What is the geometrical significance of the standard error? Its significance in
practical problems?

5.  Why  is  a  difference  said  to  be  statistically  significant  when  it  is  two  or  more 
times  the  standard  error? 

6.  Is  it  correct  to  say  that  standard  error  is  a  measure  of  experimental  error? 
Explain. 

7.  What  is  the  difference  between  a  one-way  and  two-way  table  in  the  calculation  of 
probability?  Interpret  probabilities  calculated  from  each  kind  of  a  table. 

8.  How do odds differ in one-way and two-way tables?

9.  How can odds be transferred from a one-way to a two-way basis? Explain the difference in interpretation.

10.  Explain  the  difference  between  the  standard  error  of  a  single  observation  and 
the  standard  error  of  the  mean.  Give  the  formula  for  each. 

11.  What is the formula for the standard error of an average of averages? Standard
error of a difference?

12.  What  is  the  relation  of  the  standard  error  to  the  probable  error?  Why  do  most 
statisticians  prefer  to  use  standard  error? 

13.  Why are special methods used for small samples?

14.  What is meant by "degrees of freedom"?

15.  Who  was  "Student"?  What  were  some  of  his  contributions  to  statistics? 

16.  What  is  the  meaning  of  Fisher's  "t"? 

17.  What  was  Student's  "Z"?  How  can  it  be  transformed  to  "t"? 

18.  How  is  the  standard  error  calculated  for  the  means  of  two  independent  small 
samples  drawn  from  populations  with  equal  standard  deviations? 

19.  What  is  Student's  pairing  method?  How  does  it  differ  from  other  methods  of  cal- 
culating standard  errors? 

20.  Under what conditions can Student's pairing method be used? What are some of its
limitations?

PROBLEMS


1.  (a)    If   the  mean  of  a  population  is  21.65,    and  o~    -■     3'21\>    determine  the  probabili- 

ty  that  a  variate  taken  at  random  will  he  greater  than  28.55  or  less  than 
1V.75. 

(h)  Determine  d/tr  f or  P  -=  0.01,    0.05,    and  0.;30. 

2.  Suppose the odds in a 1-way table are 87:1. Transform them to a 2-way basis.

3.  In  a  wheat   variety  test,   yields   in  bushels  per  acre  were  as  follows: 

Kanred:    54.6, 53.7, 68.0, 55.2, 58.5, 62.1, 56.7, 64.2, 57.5,    $\bar{x}$ = 53.3

Cheyenne:  66.3, 60.9, 64.3, 67.6, 63.8, 62.2, 63.4, 60.6, 67.2, 55.3,    $\bar{x}$ = 64.3

Calculate: (a) The standard error for a single plot (s), and the standard error of the mean ($\sigma_{\bar{x}}$) for each variety; (b) The standard error of the difference between the two varieties ($\sigma_d$); and (c) Determine whether or not the difference between the varieties is statistically significant. Assume the population standard deviations are different.

4.  The  yields  of  two  varieties  in  bushels  per  acre  are  as  follows  for  several  repli- 
cations: 

Variety  A:   58. 40,40,42,39,35,32/26,42,  and  44. 
Variety  B:   37,J>7,40,40,32, 30,  and  31. 

Compute  a  pooled  estimate  of  the  standard  error  (s)  of  the  two  varieties,  compute 
t,  and  determine  whether  or  not  the  varieties  differ  significantly  in  yield  by 
reference  to  Table  II  in  the  appendix. 

5.  Two varieties of small grain, Big Four and Great Northern, were grown each year in adjacent plots from 1912 to 1920. The yields are given below.

              Yields in Bushels per Acre

Year        Great Northern        Big Four

1912             71.0               54.7
1913             73.9               60.6
1914             48.9               45.1
1915             78.9               71.0
1916             43.5               40.9
1917          [illegible]           45.4
1918             63.0               53.4
1919             48.4               41.2
1920             43.1               44.8

Which variety yields higher? Is this difference significant when the data are treated as (a) means of two independent samples? (b) paired samples?

6.  The grain yields in grams per plot for spring wheat irrigated at the tillering
and jointing stages were as follows for 1921 to 1923 (incl.):


Year 


Plot 


Tillering 


Jointing 


1921 


A 

B 
C 
D 


155 
232 
2^3 
257 


281 
202 
271 

265 


1922 


A 
B 
C 

D 


1+59 
332 
3to 
312 


366 
)+o8 
396 
366 


1923 


A 
B 
C 
D 


513 
5oi 

563 

3U6 


602 
635 
593 
539 


3 -Year  Average 


360 


too 


Determine  whether  or  not  irrigation  at  tillering  results  in  a  significantly  higher 
yield  than  irrigation  at  the  jointing  stage.  Consider  the  values  paired. 


CHAPTER  VII 

THE BINOMIAL DISTRIBUTION AND ITS APPLICATIONS

I. The Binomial Distribution

Suppose that "p" is the probability that an event will occur in one trial, and "q" the probability of failure of that event to occur. Then, it can be shown by means of the two theorems on probability that the successive terms of the binomial expansion will give the respective probabilities that, in N trials, this event will occur exactly N, N - 1, N - 2, ..., or 0 times. The binomial expansion is as follows:

$$(p + q)^N = p^N + \frac{N}{1}\, p^{N-1} q + \frac{N(N-1)}{1 \cdot 2}\, p^{N-2} q^2 + \frac{N(N-1)(N-2)}{1 \cdot 2 \cdot 3}\, p^{N-3} q^3 + \cdots + q^N$$

where evidently p + q = 1.

Then, the probability of exactly X occurrences in N trials is:

$$\frac{N!}{X!\,(N - X)!}\; p^X q^{N-X}$$

where $N! = 1 \cdot 2 \cdot 3 \cdots N$.

This expansion is called the Bernoulli series or distribution. When p = q, the binomial distribution is symmetrical. This distribution is similar to the normal distribution for large values of N but it is unsuited for continuous variables because the distribution itself is discontinuous.

(a) Relation to Probability

Suppose a die is thrown 20 times. In this case, the 21 terms of the expansion $(1/6 + 5/6)^{20}$ will give the various probabilities of a particular face, say six, appearing 20, 19, 18, 17, ..., or 0 times.

Now suppose that the problem is more complicated. Let 4 dice be thrown 20 times and the sixes counted that appear on each throw. In any one throw the probabilities of getting 4, 3, 2, 1, or 0 sixes are given by the terms of $(1/6 + 5/6)^4$. To secure the most probable results of the experiment, multiply each of these probabilities by 20. The expected numbers of throws for each count of sixes are:

P(4)  =  1/1296 times 20    =  0.016

P(3)  =  20/1296 times 20   =  0.308

P(2)  =  150/1296 times 20  =  2.310

P(1)  =  500/1296 times 20  =  7.710

P(0)  =  625/1296 times 20  =  9.640

Then, the most probable outcome of the experiment is: no sixes, 10 times; 1 six, 8 times; 2 sixes, 2 times; 3 sixes, 0 times; and 4 sixes, 0 times.
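The terms of the expansion and the expected numbers of throws can be generated directly. The Python sketch below is an illustration of the general term of the binomial; the function name `binomial_probability` is ours. The exact figures differ slightly in the third decimal from the hand-rounded values in the text.

```python
from math import comb

def binomial_probability(x, n, p):
    """Probability of exactly x occurrences in n trials, each with probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

throws = 20   # number of times the 4 dice are thrown
dice = 4

# Expected number of throws showing k sixes, for k = 4, 3, 2, 1, 0
expected = {k: throws * binomial_probability(k, dice, 1 / 6) for k in range(dice, -1, -1)}

for k, value in expected.items():
    print(f"{k} sixes: expected {value:.3f} throws, nearest whole number {round(value)}")
```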

(b) Constants of the Binomial Distribution

The formulas for the more important constants of the binomial distribution are as follows:

Mean number of occurrences, $\bar{x} = Np$    (1)

Variance, $\sigma^2 = Npq$    (2)

Standard error, $\sigma = \sqrt{Npq}$    (3)

Probable error, P.E. $= \pm\, 0.6745 \sqrt{Npq}$    (4)

Mean proportion of occurrence, $p = \dfrac{N_1 p_1 + N_2 p_2}{N_1 + N_2}$    (5)

Variance of proportion of occurrences, $\sigma^2 = \dfrac{pq}{N}$    (6)

Standard error of proportion of occurrences, $\sigma = \sqrt{\dfrac{pq}{N}}$    (7)

II.  Applications  of  the  Binomial  Distribution 

The  Binomial  distribution  may  have  a  variety  of  uses  in  comparisons  of  observed  data 
with  an  a  priori  hypothesis  or  the  comparisons  of  two  samples. 

(a)  Comparison  of  Observations  against  an  a  Priori  Hypothesis. 

Suppose in a sample of N trials of an experiment the number of occurrences of
a given phenomenon is x. Let it be desired to test this result against an
accepted standard outcome of such experiments, the expected proportion of occurrences
being p. The expected number of occurrences is $\bar{x} = Np$, and the discrepancy will be the
numerical value of $x - Np = d$. Then the probability that corresponds to
$t = d/\sigma = d/\sqrt{Npq}$ may be found with the aid of the t-table when N is small, or with the table
of normal curve areas when N is large. It would be equivalent to test the proportion
(x/N) against the expected proportion (p) by the formation of

$$t = \frac{d}{\sigma} = \frac{(x/N) - p}{\sqrt{pq/N}}$$

A very common application is the comparison of observed data for monohybrid Mendelian
ratios with the theoretical. (See III below.)

(b)  Comparisons  of  Samples  from  Different  Populations 

It may be desirable to compare the proportion of occurrences in two samples
from admittedly different populations. Then the samples provide the following
information:

Sample I                                                           Sample II

$N_1$                          Number of cases or trials           $N_2$
$x_1$                          Number of occurrences of a          $x_2$
                               given phenomenon
$p_1 = x_1/N_1$                Proportion of occurrences           $p_2 = x_2/N_2$
$\sigma_1^2 = p_1 q_1/N_1$     Variance of the proportion          $\sigma_2^2 = p_2 q_2/N_2$

The difference in proportions is $d = p_1 - p_2$.

The standard error of the difference is

$$\sigma_d = \sqrt{\sigma_1^2 + \sigma_2^2} = \sqrt{\frac{p_1 q_1}{N_1} + \frac{p_2 q_2}{N_2}}$$

Thus,

$$t = \frac{d}{\sigma_d} = \frac{p_1 - p_2}{\sqrt{\dfrac{p_1 q_1}{N_1} + \dfrac{p_2 q_2}{N_2}}}$$

(c) Comparison of Samples from the Same Population

The difference between this problem and that in (b) above consists in the
hypothesis that but a single population is being considered. As a result, the data
afforded by both samples are combined to give estimates of p and σ for the population.
Thus, the estimated proportion of occurrences will be

$$p = \frac{N_1 p_1 + N_2 p_2}{N_1 + N_2}$$

and the estimated standard error of the difference between the two sample proportions will be

$$s = \sqrt{pq\left(\frac{1}{N_1} + \frac{1}{N_2}\right)}$$

where $q = 1 - p$. Then $t = (p_1 - p_2)/s$ may be interpreted as in previous cases.
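The comparisons in (b) and (c) differ only in how the standard error of the difference is estimated. The sketch below sets the two side by side in Python. The function names and the counts in the usage lines are hypothetical, and the pooled case assumes the usual form of the standard error, the square root of pq(1/N1 + 1/N2).

```python
from math import sqrt

def t_different_populations(x1, n1, x2, n2):
    """t for two proportions, each sample supplying its own variance (case b)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

def t_same_population(x1, n1, x2, n2):
    """t for two proportions with a pooled estimate of p (case c)."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)          # equals (N1*p1 + N2*p2) / (N1 + N2)
    s = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / s

# Hypothetical counts: 40 occurrences in 100 trials against 60 in 100 trials.
print(round(t_different_populations(40, 100, 60, 100), 3))
print(round(t_same_population(40, 100, 60, 100), 3))
```

With equal sample sizes the two values are close; they diverge more when the samples differ greatly in size or in proportion.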

III. Standard Errors of Mendelian Ratios

In the analysis of genetic data, it is necessary to test the significance of the
deviation of the observed counts from the calculated counts obtained when certain
theoretical conditions are postulated. With monohybrid ratios, the general practice
is to use the standard error of the binomial distribution, which is sometimes
referred to through the probable error of a proportion.
The ratios which may be tested in this work by the binomial distribution are:
1:1, 3:1, 9:7, 13:3, 15:1, 63:1, and 27:37.

(a) Formula for Mendelian Ratios

The standard error of a Mendelian ratio is:

$$\sigma = \sqrt{p(1 - p)N} \qquad (8)$$

where N = the number of individuals, p = the proportion of one group as a decimal
fraction, and 1 - p = the proportion of the other group as a decimal fraction
(1 - p = q). Some writers use the equivalent formula, $S.E. = \sqrt{p \cdot q \cdot N}$, where "p" and "q" represent
the proportions in decimal fractions.

(b) Use of Method

The binomial method can be used in genetic data only when two phenotypic
classes are grouped, other methods being used for three or more classes. In the F2
generation of a barley cross, 200 green and 72 white seedlings were counted. It is
desired to test these data for a 3:1 ratio.

                        Green    White    Total (N)

Observed numbers         200       72       272
Calculated 3:1 ratio     204       68       272
Deviation                  4        4

To obtain the calculated number for a 3:1 ratio, divide the total number observed by
the sum of the terms of the ratio, which is 4 in this case, e.g., 272/4 = 68.
This gives the calculated value for the white (or 1) class. For the green (or 3)
class, multiply 68 by 3. This gives 204.

$$\sigma = \sqrt{p(1 - p)N} = \sqrt{0.75 \times 0.25 \times 272} = 7.1414$$

Next, the deviation divided by the standard error is computed:

$$d/\sigma = 4/7.1414 = 0.56$$


The observed ratio fits the calculated 3:1 ratio very well, indicating that a simple
Mendelian factor pair is responsible for the production of green and white seedlings.
It is to be noted that d/σ is less than 2, which indicates that the fluctuation of
the observed ratio from the calculated may be considered as due to chance. In any
event, there is no reason to reject the theoretical ratio hypothesis.

Another example may be given for green and white barley seedlings.

                        Green    White    Total      d        σ       d/σ

Observed                 183      161      344
Calculated 3:1 ratio     258       86      344      75      8.0312    9.34
Calculated 9:7 ratio     193.5    150.5    344      10.5    9.2009    1.14

It is apparent that the data do not fit a 3:1 ratio, as shown by the high value of
d/σ. However, they fit a 9:7 ratio very well, indicating that there are two factor
pairs involved in the production of green vs. white seedlings in this cross.
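The two barley examples follow the same steps: compute the expected count for one class, take the deviation, and divide by the square root of p(1 - p)N. A minimal Python sketch of those steps is given here; the function name and the tuple layout are illustrative choices, not part of the original method description.

```python
from math import sqrt

def ratio_test(observed, ratio):
    """Test a two-class count against a Mendelian ratio.

    observed: (count_a, count_b); ratio: (parts_a, parts_b), e.g. (3, 1).
    Returns (deviation d, standard error sigma, d / sigma).
    """
    n = sum(observed)
    p = ratio[0] / sum(ratio)            # expected proportion of the first class
    expected_a = n * p
    d = abs(observed[0] - expected_a)    # numerical value of the deviation
    sigma = sqrt(p * (1 - p) * n)
    return d, sigma, d / sigma

# Green and white seedling counts from the two examples in the text.
for counts, ratio in [((200, 72), (3, 1)), ((183, 161), (3, 1)), ((183, 161), (9, 7))]:
    d, sigma, t = ratio_test(counts, ratio)
    print(counts, ratio, f"d = {d:.1f}, sigma = {sigma:.4f}, d/sigma = {t:.2f}")
```

A value of d/sigma below 2 is read, as in the text, as a deviation that may be ascribed to chance.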

(c) Short-Cut Tables for Computations

Tables published by Cornell University give the probable errors for values
of N from 11 to 1000. Another set of tables occurs in "Mendelian Inheritance in
Wheat and Barley Crosses," by Kezer and Boyack (1918). The probable error values
obtained from such tables can be converted to standard errors by the division of the
probable error value by the factor 0.6745.

IV. The Poisson Distribution as a Special Case

As a rather special case of the binomial distribution, there is an approximation
known as the Poisson distribution. This occurs when p, the probability of the
occurrence of an event, is very small and N, the number of trials, is very large, so
that Np remains appreciable. In a Poisson distribution the probability, $P_x$, of
exactly x occurrences in N (N = very large) trials is given by:

$$P_x = \frac{e^{-Np}\,(Np)^x}{x!}$$

where e is a constant (2.718), p is the probability of occurrence in a single
trial, and $x! = 1 \cdot 2 \cdot 3 \cdots x$.

Although tables of these probabilities have been published, their use is unnecessary
in the more common types of application.

(a)  Constants  of  the  Poisson  Distribution 

For the Poisson distribution, the mean and variance are equal.

Mean = $\bar{x} = Np$

Variance = $\sigma^2 = Np$, so that $\sigma = \sqrt{Np}$
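How closely the Poisson formula follows the binomial when p is small and N is large can be seen numerically. The sketch below is an illustration only: the values N = 10,000 and p = 0.001 are hypothetical, chosen so that Np = 10, and the function names are not taken from the text.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Probability of exactly x occurrences in n trials (binomial term)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, mean):
    """Probability of exactly x occurrences when the mean Np equals `mean`."""
    return exp(-mean) * mean**x / factorial(x)

# Hypothetical case: N = 10,000 trials with p = 0.001, so that Np = 10.
n, p = 10_000, 0.001
for x in (0, 5, 10, 15, 20):
    print(x, f"{binomial_pmf(x, n, p):.5f}", f"{poisson_pmf(x, n * p):.5f}")
```

The two columns agree to about three decimal places, which is why the simpler Poisson form is used for rare events in large samples.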

(b)  Use  of  Poisson  Distribution 

The Poisson distribution gives a basis for the solution of many problems
that involve the maintenance of certain standards. Suppose that registered seed
regulations state that red clover seed must not contain over a given percentage of
noxious weed seeds in order to gain certification. Suppose that from a lot of seed,
a sample is taken of such size that a count of 10 noxious weed seeds corresponds to
the allowable percentage. In this case, the mean $\bar{x} = Np = 10$. The standard error,
$\sigma = \sqrt{Np} = \sqrt{10} = 3.16$. Suppose that 18 weed seeds occur in the sample analyzed.
The whole lot is rejected for registration because the deviation from the mean,
18 - 10 = 8, exceeds twice the standard error, i.e., 2σ = 6.3. Suppose that 16
noxious weed seeds are counted in a sample from another lot. Now a decision becomes
questionable, because the deviation of 6 falls just short of 2σ. Suppose that a second
sample is taken and 14 weed seeds counted. Now consider the two samples as one. The
mean, $\bar{x} = Np = 20$, and the standard error, $\sigma = \sqrt{Np} = \sqrt{20} = 4.5$. Thus,
14 + 16 = 30, which differs from the mean by 10. This lot would be rejected because
the deviation from the mean, 10, exceeds 2σ = 2(4.5) = 9.0.
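The seed-certification rule reduces to one comparison: reject when the count exceeds the allowed mean by more than twice the square root of that mean. A small Python sketch of that rule follows; the function name and the default multiplier k = 2 are assumptions made for illustration.

```python
from math import sqrt

def exceeds_tolerance(count, allowed_mean, k=2.0):
    """True when the count lies more than k standard errors above the allowed mean.

    For a Poisson count the standard error is the square root of the mean.
    """
    sigma = sqrt(allowed_mean)
    return (count - allowed_mean) > k * sigma

print(exceeds_tolerance(18, 10))        # single sample with 18 weed seeds
print(exceeds_tolerance(16, 10))        # single sample with 16 weed seeds
print(exceeds_tolerance(16 + 14, 20))   # two samples combined, allowed mean doubled
```

Combining the two samples doubles the allowed mean but raises the standard error only by the square root of 2, which is why the second lot is rejected once both samples are considered together.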


References


1. Anonymous. Tables of Probable Error of Mendelian Ratios. Department of Plant
   Breeding, Cornell University (mimeographed).

2. Fisher, R. A. Statistical Methods for Research Workers (5th edition), pp. 55-72.
   1934.

3. Kezer, Alvin, and Boyack, B. Mendelian Inheritance in Wheat and Barley Crosses.
   Colorado Exp. Sta. Bul. 249. 1918.

4. Miles, S. R. A Very Rapid and Easy Method of Testing the Reliability of an
   Average and a Discussion of the Normal and Binomial Methods.

5. Robertson, D. W. The Effect of a Lethal in the Heterozygous Condition on Barley
   Development. Colorado Exp. Sta. Tech. Bul. 1. 1932.

6. Sinnott, E. W., and Dunn, L. C. Principles of Genetics. McGraw-Hill, pp. 371-375.
   1932.

7. Tippett, L. H. C. The Methods of Statistics. Williams and Norgate, pp. 30-33.
   1931.


Questions  for  Discussion 


1. Give the binomial expansion of $(p + q)^N$.

2.  What  type  of  distribution  is  the  binomial  distribution?  What  are  its  limitations? 
3. What is the genetic application of the binomial distribution? Its limitations?

4. How does the Poisson distribution differ from the binomial distribution?

5. Under what conditions might the Poisson distribution prove useful?


Problems

1. Colsess, a white-glumed barley, was crossed with Nigrinudum, a black-glumed barley.
The segregation in the F2 was 785 black-glumed plants and 215 white-glumed plants.
What ratio best fits these data? Calculate d/σ.

2. The F2 segregation of a Colsess (hooded) by Minnesota 90-8 (awned) cross gave 229
hooded plants and 89 awned plants. Determine the ratio that best fits these data,
and test its fit.

3. In a cross between Colsess II and Colsess III, 183 green seedlings and 161 white
seedlings were observed in the F2. Determine the ratio that best fits these data
and test the fit.


CHAPTER  VIII 
THE $\chi^2$ TESTS FOR GOODNESS OF FIT AND FOR INDEPENDENCE

I. The $\chi^2$ Test

So far, statistics like the sample mean ($\bar{x}$) and the standard deviation (s) have been
used to express differences between distributions, either an observed against a
hypothetical distribution, or one observed distribution against another. However, in
such cases the general form of the distribution (normal, binomial, Poisson) has been
assumed and comparisons have been limited to values of parameters of the distribution.
The use of moments such as these might be adequate for an accurate comparison of
distributions were a sufficient number of higher moments employed. However, this method
has the principal disadvantage of being tedious as well as involving questions as to
the validity of the sampling errors of higher moments.

Many times it is desired to compare or test observed data with those expected on the
basis of some hypothesis. This has been referred to as a test for "goodness of fit."
Again, individuals may be measured or classified categorically with respect to two
separate characters or conditions. It may be desired to test these characters for
association. Both of these general problems can be attacked by use of a statistic
known as $\chi^2$ (Chi-squared) calculated from the data afforded by the sample.

II. The $\chi^2$ Distribution

The $\chi^2$ test, to measure "goodness of fit" of observed results to those expected, was
advanced by Karl Pearson in 1900.

(a) Formula for $\chi^2$

The theoretical distribution must be adjusted to give the same total frequency
as the observed. Then, when O is the number observed in any one group or category
of the experimental distribution, and C the theoretically calculated number for the
same group, based on the hypothesis that the data follow some certain distribution,
the formula for $\chi^2$ is as follows:

$$\chi^2 = S\left[\frac{(O - C)^2}{C}\right] \qquad (1)$$

where "S" is the summation extended over all the groups or classes. It is obvious
that the more closely the observed number agrees with the calculated, the smaller $\chi^2$
will be. Further, all differences in frequency (O - C) are squared, whether positive
or negative. Thus, $\chi^2$ is always a positive quantity, its size being clearly dependent
on the number of groups into which the distribution is separated and the degree of
agreement between the several values of "O" and the corresponding values of "C".
Therefore, in the ordinary application of $\chi^2$, the number of degrees of freedom will be the
number of groups diminished by the number of restrictions imposed on the theoretical
distribution that supplies the values of C. When the only restriction imposed is
that the total frequencies of the observed and theoretical distributions shall be
equal, the degrees of freedom are one less than the number of groups. In other words,
where the frequencies are determined for all groups but one, the frequency of that
one is automatically determined by subtraction from the total.

(b)  Sampling  Distribution 

The sampling distribution of $\chi^2$ has been worked out so that it is possible to
find the probability (P) of obtaining from a hypothetical population with a given
distribution, a sample that shows a distributional variation from that of the
population which would result in a $\chi^2$ value as large as or larger than that exhibited by the
sample in hand. For any number of degrees of freedom, P = 1.00 for $\chi^2$ = 0 and,
as $\chi^2$ increases, P diminishes. Since the
mathematical relationships between $\chi^2$ and P are complex, it is necessary to have
tables that relate P, $\chi^2$, and the number of degrees of freedom, for practical use.

(c) Grouping Data

It is unwise to group too finely or to apply this test where the data are so
insufficient that, for certain of the groups, the expected frequency is small. This
condition very easily might cause that part of $\chi^2$ contributed by such groups to
unduly affect the total $\chi^2$. This is obvious from the mathematical form, $(O - C)^2/C$,
where C is small. Fisher (1934) recommends that each group should contain at least
five individuals for the test to apply. Sometimes the tail groups with very low
frequencies should be combined.

III. Probability Tables for $\chi^2$

As has been stated, the probabilities for $\chi^2$ values are obtained from tables. In
order to use them, it is first necessary to know "n", the number of degrees of
freedom in which the observed series may differ from the hypothetical. It is equal to
the number of classes, the frequencies in which may be filled arbitrarily. When only
the totals have been made equal, n = n' - 1, where n' is the total number of classes
or groups. In contingency tables, where tests for independence are being made, the
number of degrees of freedom is the product of the number of rows minus one and the
number of columns minus one, (r - 1)(c - 1), because the hypothetical and observed
classifications are forced to conform both for row and column totals. To quote
Tippett (1931): "Suppose, in an
extreme case, there are n' groups and we fitted a curve involving n' constants which
were calculated from the data; then the two distributions would agree exactly and $\chi^2$
would be zero because sampling errors would have had no play." The importance of
degrees of freedom in looking up the probabilities that correspond to $\chi^2$ has been
emphasized by Fisher (1922, 1923, 1934).

(a) Elderton "Table of Goodness of Fit"

A table was prepared by Elderton with the values of "P" (probability) that a
deviation as great as or greater than the observed may be expected on the basis of
random sampling. These values correspond to each integral value of $\chi^2$ from 1 to 30.
This table is available in "Tables for Statisticians and Biometricians" by Karl
Pearson (1914). The user must be careful with this table because n' is equal to the
number of degrees of freedom (n) plus one. The probability of intermediate $\chi^2$ values
can be obtained approximately by interpolation.*

*Note: For example, the probability for $\chi^2$ = 4.12 determined from 4 classes can be
interpolated as follows:

When $\chi^2$ = 4,  P = 0.261464
     $\chi^2$ = 5,  P = 0.171797

Difference                             0.089667
Product        0.12 x 0.089667      =  0.010760
"P" value      0.261464 - 0.010760  =  0.250704

(b) Fisher "Table of $\chi^2$"

More recently, Fisher (1934) has published a table of $\chi^2$ which uses degrees
of freedom (n) directly. It gives values of $\chi^2$ that correspond to special values of
"P". Fisher (1934) states: "In preparing this table we have borne in mind that, in
practice, we do not want to know the exact value of 'P' for any observed $\chi^2$, but, in
the first place, whether or not the observed value is open to suspicion. If 'P' is
between 0.1 and 0.9 there is certainly no reason to suspect the hypothesis tested.
If it is below 0.02 it is strongly indicated that the hypothesis fails to account for
the whole of the facts. We shall not often go astray if we draw a conventional line
at 0.05 and consider that higher values of $\chi^2$ indicate a real discrepancy." The
table given by Fisher has values of "n" up to 30. Beyond this point it will be found
sufficient to assume that $\sqrt{2\chi^2} - \sqrt{2n - 1}$ is distributed normally with unit standard
deviation about zero. For example:

$\chi^2$ = 35.62, n = 32, $\sqrt{2\chi^2}$ = 8.44, $\sqrt{2n - 1}$ = 7.94, Difference = 0.50.

Thus, where $\sqrt{2\chi^2} - \sqrt{2n - 1}$ is materially greater than 2, the value of $\chi^2$ is not in
accordance with expectation.
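The large-n rule can be tried out in a few lines of Python. This is a sketch under stated assumptions: the function names are illustrative, and converting the deviate to an upper-tail probability with the normal curve is an added convenience rather than a step given in the text.

```python
from math import erfc, sqrt

def large_n_deviate(chi_sq, n):
    """Approximate normal deviate sqrt(2*chi_sq) - sqrt(2n - 1), for n above 30."""
    return sqrt(2 * chi_sq) - sqrt(2 * n - 1)

def upper_tail(z):
    """Area of the standard normal curve above z."""
    return 0.5 * erfc(z / sqrt(2))

z = large_n_deviate(35.62, 32)   # the example from the text
print(f"deviate = {z:.2f}, upper-tail probability = {upper_tail(z):.2f}")
```

A deviate near 0.5 is far below 2, so the example value of 35.62 with 32 degrees of freedom agrees with expectation.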

(c)  Normal  Probability  Integral  Table 

In the special case of one degree of freedom (n = 1), the probability can
be obtained from the table of the normal probability integral because $\chi$ is normally
distributed for one degree of freedom. (See Table II, "Tables for Statisticians and
Biometricians.") For example, suppose it is desired to find the probability that
corresponds to $\chi^2$ = 3.200.

$$\chi = \sqrt{\chi^2} = \sqrt{3.200} = 1.7889$$

In the table opposite t = 1.7889, the value of the probability that corresponds to it
is found to be 0.9632. The value of the probability for the one tail will then be
1.0000 - 0.9632 = 0.0368. On the basis of a 2-tailed table it would be 0.0368 x 2 =
0.0736.
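The same lookup can be done without a printed table, since the normal probability integral is available through the error function. The sketch below assumes one degree of freedom throughout; the function name is an illustrative choice.

```python
from math import erfc, sqrt

def p_one_degree_of_freedom(chi_sq):
    """Return (chi, one-tail area, two-tail area) for a chi-square with 1 d.f."""
    chi = sqrt(chi_sq)
    one_tail = 0.5 * erfc(chi / sqrt(2))   # area of the normal curve beyond chi
    return chi, one_tail, 2 * one_tail

chi, one_tail, two_tail = p_one_degree_of_freedom(3.200)
print(f"chi = {chi:.4f}, one tail = {one_tail:.4f}, two tails = {two_tail:.4f}")
```

The two-tail figure is the one to compare with the P values of a chi-square table.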

A  --  Goodness  of  Fit 

IV. Uses of $\chi^2$ for Goodness of Fit

The $\chi^2$ test for "goodness of fit" can be applied to data grouped into classes where
it is desired to compare them with a theoretical or hypothetical ratio. The great
advantage of this test for goodness of fit is that no limitations or conditions are
imposed upon the form of the distribution under investigation. Historically, the $\chi^2$
test was first used to test the goodness of fit of an observed frequency distribution
to a normal distribution of the same total frequency, the same mean, and the same
standard deviation. It is still used effectively for this purpose when the number in
the sample is large. One sacrifices a fit in the tails of the distribution by use of
the $\chi^2$ test, but often the investigator is only interested in the central range which
the data cover. The $\chi^2$ test is particularly useful in genetics to test F2 and later
segregations where two or more phenotypic classes are involved. J. Arthur Harris
(1912) first called attention to the value of the $\chi^2$ test for genetic data.

V. Computation of $\chi^2$ for Goodness of Fit

In Mendelian ratios from F2 progenies and later generations, the common practice is
to summate the numbers in each phenotypic class and to formulate a hypothesis on the
basis of the ratio obtained in order to establish the number of genetic factors
involved. The $\chi^2$ test is used to determine whether or not the deviations of the observed
numbers from the calculated numbers may be attributed to chance.


(a) General Method of Computation

In a cross that involves two independently inherited Mendelian factor pairs,
a 9:3:3:1 ratio is expected in the F2 generation. A segregation in the F2 generation
of a barley cross that involved long vs. short-haired rachilla (Ss) and covered vs.
naked seeds (Nn) gave results as follows (data from Robertson):

     Long-Haired Rachilla             Short-Haired Rachilla          Total

Covered Seeds     Naked Seeds     Covered Seeds     Naked Seeds

    2061              645              675              256           3637

    (SN)              (Sn)             (sN)             (sn)

The calculated numbers for a 9:3:3:1 ratio are computed so that the total of the theoretical
values equals the total in the sample, i.e., 3637. The value 3637 is divided by 16
(9 + 3 + 3 + 1) to give the expected number in the class with short-haired rachillas
and naked seeds, i.e., 3637/16 = 227.3125. The values on the basis of expectancy
for the 3-classes can be computed by multiplying 227.3125 by 3 = 681.9375, etc. The
results can be put down as follows:

Classes   Ratio   Observed    Calculated     O - C     (O - C)^2    (O - C)^2/C
                  No. (O)      No. (C)

SN          9       2061       2045.81       15.19      230.7361      0.1128
Sn          3        645        681.94      -36.94     1364.5636      2.0010
sN          3        675        681.94       -6.94       48.1636      0.0706
sn          1        256        227.31       28.69      823.1161      3.6211

Totals              3637       3637.00                  $\chi^2$ =    5.8055

n = 3     P = 0.1233

Hence, the deviations from the calculated ratio cannot be regarded as significant.
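The tabular computation can be reproduced with a short Python sketch. The function names are illustrative. The closed form used for P holds only for three degrees of freedom, and because the sketch works with unrounded calculated numbers and an exact P rather than table interpolation, its results differ slightly in the later decimals from those of the table.

```python
from math import erfc, exp, pi, sqrt

def chi_square_for_ratio(observed, ratio):
    """Return (calculated numbers, chi-square) for counts tested against a ratio."""
    total, parts = sum(observed), sum(ratio)
    calculated = [total * r / parts for r in ratio]
    chi_sq = sum((o - c) ** 2 / c for o, c in zip(observed, calculated))
    return calculated, chi_sq

def p_three_degrees_of_freedom(chi_sq):
    """Probability of a chi-square this large or larger, valid for 3 d.f. only."""
    return erfc(sqrt(chi_sq / 2)) + sqrt(2 * chi_sq / pi) * exp(-chi_sq / 2)

# F2 counts for SN, Sn, sN, sn from the barley cross in the text.
calculated, chi_sq = chi_square_for_ratio([2061, 645, 675, 256], [9, 3, 3, 1])
print([round(c, 2) for c in calculated])
print(f"chi-square = {chi_sq:.4f}, P = {p_three_degrees_of_freedom(chi_sq):.4f}")
```

The conclusion is unchanged: P is well above 0.05, so the 9:3:3:1 hypothesis is not rejected.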

(b)  Method  for  Two  Classes 

The $\chi^2$ value may be calculated directly where "A" is the number in one class,
"a" the number in the other, and "N" is the total number in the sample (A + a).
These formulae are given by Immer (1936) and represent a transformation from the
standard method for the computation of $\chi^2$ for goodness of fit.

Ratio A : a          $\chi^2$ Value

1 : 1          $\frac{(A - a)^2}{N}$             (2)

3 : 1          $\frac{(A - 3a)^2}{3N}$           (3)

9 : 7          $\frac{(7A - 9a)^2}{63N}$         (4)

m : n          $\frac{(nA - ma)^2}{mnN}$         (5)

The computation may be illustrated with data which appear to fit a 3 : 1 ratio.
A = 2903, a = 936, and N = 3839.

$$\chi^2 = \frac{(A - 3a)^2}{3N} = \frac{(2903 - 2808)^2}{3 \times 3839} = 0.7836$$

With one degree of freedom, P is approximately 0.38.
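Formula (5) covers the other three as special cases, so one small function is enough for any two-class ratio. The Python sketch below is illustrative; the function names are not taken from Immer or from the text, and the second function exists only to show that the shortcut agrees with the standard sum of (O - C)^2/C.

```python
def chi_square_two_classes(A, a, m, n):
    """Chi-square for counts A and a tested against an m : n ratio (formula 5)."""
    N = A + a
    return (n * A - m * a) ** 2 / (m * n * N)

def chi_square_standard(A, a, m, n):
    """The same test by the standard method, summing (O - C)^2 / C over both classes."""
    N = A + a
    calc_A, calc_a = N * m / (m + n), N * n / (m + n)
    return (A - calc_A) ** 2 / calc_A + (a - calc_a) ** 2 / calc_a

print(round(chi_square_two_classes(2903, 936, 3, 1), 4))   # 3 : 1 example from the text
print(round(chi_square_standard(2903, 936, 3, 1), 4))      # same value by the long method
```

Setting m = n = 1 gives formula (2), and m = 9, n = 7 gives formula (4).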

(c) The $\chi^2$ Test Applied to Several Genetic Families

In genetic data, Kirk and Immer (1928) show that the total class frequencies
obtained by summation are composite results which may easily mask a serious lack of
consistency in numerical ratios of the separate families with respect to agreement
with expectation. To summate the numbers in each class of all progenies is to rely
on mean values and thereby disregard deviations from the ratio expected to occur in
each family. This applies particularly where the numbers are small. The smaller the
number in each progeny, the greater the opportunity to err when the summations are
taken as an indication of the genetic constitution. In such cases, a goodness of
fit test like $\chi^2$ is required which involves in its calculation deviations from
expectancy for each class of each progeny. It should be mentioned that $\chi^2$ values
cannot be averaged. However, they are additive provided the number of degrees of
freedom are properly taken into account.

VI. Fit of Observed Data to the Normal Curve

The $\chi^2$ criterion is useful to determine whether or not observed data give an
acceptable fit to the normal curve or any other assumed form of distribution. It is useful
where the sample is large and where the requirements for $\chi^2$ are fulfilled. First,
the range of measures is divided into an arbitrary number of classes so as to meet
the number of measures in the separate classes which a valid use of the $\chi^2$ criterion
demands. Data on number of culms counted on 411 wheat plants at the Colorado
Experiment Station are used to illustrate the computation. The data are as follows:


X (Class center)    1    3    5    7    9   11   13   15   17   19   21   23   25   27
f (Frequency)       2   24   52   85  114   69   36   18    6    1    2    1    0    1   = 411

$\bar{x}$ = 8.9173     s = 3.4715     s (corrected for grouping) = 3.4231

(1) The data are regrouped in order to have a larger number of cases in the tail
classes.

    Classes          Class Range        f

    less than 4      below 4.0         26
    5                4.0 to 6.0        52
    7                6.0 to 8.0        85
->  9                8.0 to 10.0      114
    11               10.0 to 12.0      69
    13               12.0 to 14.0      36
    15               14.0 to 16.0      18
    more than 16     above 16.0        11

                                 N =  411

(The arrow marks the class that contains the mean.)

The class range is reduced to σ-units, viz., 2/3.4231 = 0.5843.

The correction to the mean above 8.0 = 0.9173/3.4231 = 0.2680 σ-units.


(2) The next step is to calculate the end points of the class intervals in
σ-units.

                                                     Unit         Calculated
    Class range          Area Range (σ-units)        Frequency    Frequency

    Less than 4.0        -∞    to  -1.43              0.08          32.9
    4.0  to  6.0         -1.43 to  -0.85              0.12          49.3
    6.0  to  8.0         -0.85 to  -0.27              0.19          78.1
->  8.0  to 10.0         -0.27 to  +0.32              0.24          98.6
    10.0 to 12.0         +0.32 to  +0.90              0.19          78.1
    12.0 to 14.0         +0.90 to  +1.48              0.11          45.2
    14.0 to 16.0         +1.48 to  +2.06              0.05          20.6
    more than 16.0       +2.06 to  +∞                 0.02           8.2

    Total                                                          411.0

The σ-value for the upper limit of the class that contains the mean is:
0.5843 - 0.2680 = +0.3163 σ. This value is the σ-ordinate for 10.0, while -0.27 is
the σ-ordinate for 8.0, these values being the limits of the range 8.0 to 10.0. The
other area ranges are calculated by the addition or subtraction of 0.58 to determine
the next higher or lower limit. For example, it is 0.32 + 0.58 = +0.90 for the
limit 12.0.

(3) It is now necessary to compute the unit frequency for each class by
reference to a table of the probability integral (Table I, appendix). For
example, the unit frequency for the range -0.27 to +0.32 is computed as follows:

For t = -0.27,   P = 0.61 - 0.50 = 0.11
    t = +0.32,   P = 0.63 - 0.50 = 0.13

The unit frequency for the distance, -0.27 to +0.32, is equal to 0.11 + 0.13 =
0.24. This means that 24 per cent of the frequencies would be within this range
on the basis of the normal curve. The other values can be calculated in a similar
manner, except that the two values are subtracted because both limits lie on the
same side of the mean. The last class, from 2.06
to include the remainder of the curve, is computed as follows:

For t = 2.06,   P = 0.98;   1.000 - 0.98 = 0.02

(4) The next step is to multiply the unit frequencies by the number in the sample
(N) to obtain the calculated frequencies, e.g., (0.08)(411) = 32.9, etc.

(5) The observed and calculated frequencies are now compared by use of the $\chi^2$
criterion.

                     Observed      Calculated
Class range          Frequency     Frequency      O-C      (O-C)^2    (O-C)^2/C

less than 4.0            26           32.9        -6.9      47.61      1.4471
4.0  to  6.0             52           49.3         2.7       7.29      0.1479
6.0  to  8.0             85           78.1         6.9      47.61      0.6096
8.0  to 10.0            114           98.6        15.4     237.16      2.4053
10.0 to 12.0             69           78.1        -9.1      82.81      1.0603
12.0 to 14.0             36           45.2        -9.2      84.64      1.8726
14.0 to 16.0             18           20.6        -2.6       6.76      0.3282
more than 16.0           11            8.2         2.8       7.84      0.9561

Totals                  411          411.0                $\chi^2$ =   8.8271

P = 0.1172

There are 8 classes, but only 5 degrees of freedom are available because 3 constants have
been used in fitting the data to the normal curve. It is obvious in this case that
the probability (P) is greater than 0.05. Thus, the underlying distribution of the
data may have been normal. This method applies to fitting observed data to any
hypothetical distribution.
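Steps (2) to (5) can be carried out in one pass when the normal curve areas are computed directly instead of being read from a table. The Python sketch below is an illustration under stated assumptions: the variable names are hypothetical, and because the areas are not rounded to two decimals as they are in the table, the chi-square it produces is somewhat different from the tabled value, although the number of degrees of freedom is the same.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Area of the standard normal curve below z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

observed = [26, 52, 85, 114, 69, 36, 18, 11]          # regrouped culm counts
limits = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]      # class boundaries
mean, sd = 8.9173, 3.4231                             # sd corrected for grouping
n = sum(observed)

# Cumulative areas at each boundary, with 0 and 1 closing the two open tails.
cumulative = [0.0] + [normal_cdf((x - mean) / sd) for x in limits] + [1.0]
expected = [n * (hi - lo) for lo, hi in zip(cumulative, cumulative[1:])]

chi_sq = sum((o - c) ** 2 / c for o, c in zip(observed, expected))
df = len(observed) - 3      # total, mean, and standard deviation were fitted

print([round(c, 1) for c in expected])
print(f"chi-square = {chi_sq:.2f} with {df} degrees of freedom")
```

Carrying more decimals in the areas changes the individual calculated frequencies by a few units, which is a reminder that the rounding used in hand computation can noticeably affect the final chi-square.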

VII. Partition of χ² into its Components

When a discrepancy in a theoretical genetic ratio on the basis of independent inheritance
occurs, it may be produced either by linkage or a departure from the 3 : 1
ratios. Fisher (1934) has suggested a method whereby χ² can be partitioned into its
components to determine the source of the discrepancy. In a barley cross, the F2
data were as follows for non-tipped and tipped lateral spikelets (Tt), and for hoods
and awns (Kk):

                     TK           Tk           tK           tk          Total
                     (a)          (b)          (c)          (d)

Observed No.        1496          515          550          216         2777
Calculated No.      1562.06       520.69       520.69       173.56      2777.00

χ² = 14.8855        P = very small

To determine whether or not the discrepancy is due to linkage, the χ² value is
partitioned into its components as follows:

x = non-tipped vs. tipped = (a + b) - 3(c + d)
  = (1496 + 515) - 3(550 + 216) = -287

y = hoods vs. awns = (a + c) - 3(b + d)
  = (1496 + 550) - 3(515 + 216) = -147

z = interaction or linkage = a - 3b - 3c + 9d
  = 1496 - 3(515) - 3(550) + 9(216) = +245

The χ² values can be computed for each component as follows:

1. non-tipped vs. tipped:       χ² = x²/3n = (287)²/3(2777) = 9.8870

2. hoods vs. awns:              χ² = y²/3n = (147)²/3(2777) = 2.5938

3. interaction (or linkage):    χ² = z²/9n = (245)²/9(2777) = 2.4017

The data can be brought together in a summary form as below:

Factor Pairs                    d.f.        χ²           P

Non-tipped vs. tipped (Tt)       1         9.8870      0.0016
Hooded vs. awned (Kk)            1         2.5938      0.1074
Interaction                      1         2.4017      0.1212

Totals                           3        14.8825      very small

Thus, the 3 : 1 ratio for non-tipped vs. tipped is found to account for a large part
of the high χ² value. There is no indication of linkage.
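
The partition can be retraced with a short Python sketch. It is offered only as an illustration of the arithmetic above, under the assumption that a, b, c, d are the observed counts of the classes TK, Tk, tK, tk in that order. The function name partition_chi_square is hypothetical.

```python
def partition_chi_square(a, b, c, d):
    """Split the 9:3:3:1 chi-square (3 d.f.) into three single-d.f. components.

    a, b, c, d are the observed counts of the classes TK, Tk, tK, tk.
    """
    n = a + b + c + d
    x = (a + b) - 3 * (c + d)       # first factor pair against its 3:1 ratio
    y = (a + c) - 3 * (b + d)       # second factor pair against its 3:1 ratio
    z = a - 3 * b - 3 * c + 9 * d   # interaction (linkage)
    return {
        "x": x,
        "y": y,
        "z": z,
        "chi2_x": x ** 2 / (3 * n),
        "chi2_y": y ** 2 / (3 * n),
        "chi2_z": z ** 2 / (9 * n),
    }


parts = partition_chi_square(1496, 515, 550, 216)
total = parts["chi2_x"] + parts["chi2_y"] + parts["chi2_z"]
print(parts, round(total, 4))
```

The three components add up to the χ² for the full 9 : 3 : 3 : 1 comparison, which is what makes the partition a check on the source of the discrepancy.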

B  --  Test  for  Independence 

VIII. Independence and Association

When observations have been classified in two ways, it may be desirable to determine
whether or not the two variables are associated. The χ² test for independence has
been used for this purpose. Two variables are said to be associated when the numbers
in the cells of the contingency table are not randomly distributed. Contingency
tables may be manifold, there being (r - 1)(c - 1) degrees of freedom where there are
"r" rows and "c" columns. In tests for independence, the subtotals of the classes
into which the variates are distributed are used to determine the theoretical
frequencies, with the result that the subtotals must be considered as constants in the
determination of degrees of freedom. For example, the degrees of freedom in a 2 by
2 contingency table are one. The value of χ² is referred to a χ² table to determine
the value of "P" that corresponds to it for the number of degrees of freedom in the
contingency table. A "P" value greater than 0.05 indicates lack of proof of
association between two variables, i.e., they may be independent. The χ² criterion has
proved useful as a test for the independence of two genetic factor pairs.

IX. Calculation of Independence or Association

The test for independence can be made when the data are compiled either in simple
4-fold (2 by 2) or manifold contingency tables.

(a)  The  Manifold  (m  by  n)  Contingency  Table 

The computation can be illustrated by some F2 data (Hayes) in an oat cross,
Bond x D.C., where it was desired to learn whether or not there was any association
between the reaction to stem rust and to crown rust. The data are as follows, with
the calculated numbers in parentheses:

                                  Stem Rust Reaction
                            Resistant         Susceptible      Totals    Ratio

Crown      Resistant        50 (57.2494)      22 (14.7550)       72      0.2951
Rust       Susceptible     119 (112.1126)     22 (28.8950)      141      0.5779
Reaction   Intermediate     25 (24.6380)       6 (6.3500)        31      0.1270

           Totals          194                50                244      1.0000

In case the amount of stem rust infection has no influence on the amount of crown
rust infection, the 244 observations would be expected to be distributed at random in
the 6 cells of the contingency table, with the restriction that they must add up to
give the totals in the table (See Tippett, 1931, p. 69). The probability that an
observation will fall in row No. 1 is 72/244, and that it will fall in column No. 1
is 194/244. Then, the probability that an observation will fall in the first cell is
(72/244)(194/244). The expected number of individuals in that square on the basis
of independence is the probability multiplied by the total number, i.e., (72/244)
(194/244)(244) = 57.2494, the row ratio being rounded to 0.2951.

The  various  steps  in  the  computation  are  as  follows: 

(1) The ratio of rows, for row No. 1, is 72/244 = 0.2951.

(2) The theoretical frequencies can be obtained by the multiplication
of each of the ratios for rows by each of the subtotals for columns,
e.g., 0.2951 times 194 = 57.2494 for cell No. 1. The other values
are computed in a similar manner. In this case, it is necessary to
compute the value for only one other cell, i.e., 0.5779 times 194 =
112.1126. The other values can be obtained by subtraction from the
marginal totals.

(3) The observed and theoretical values are then compared by use of the
χ² criterion.


Observed    Calculated
   No.         No.           O-C        (O-C)²     (O-C)²/C

    50       57.2494       -7.2494      52.5538      0.9180
   119      112.1126        6.8874      47.4363      0.4231
    25       24.6380        0.3620       0.1310      0.0053
    22       14.7550        7.2450      52.4900      3.5574
    22       28.8950       -6.8950      47.5410      1.6453
     6        6.3500       -0.3500       0.1225      0.0193

   244      244.0000                       χ² =      6.5684

d.f. = (n - 1)(m - 1) = (3 - 1)(2 - 1) = 2  P =      0.0387

Thus, the indications are that there is an association between the reactions to stem
rust and to crown rust.
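
The steps for the manifold table can be sketched in Python as follows. This is an illustrative sketch rather than the manual's own procedure: it uses exact expected numbers (row total times column total divided by the grand total) instead of row ratios rounded to four places, so χ² comes out near 6.569 rather than exactly the tabled 6.5684. The function names are hypothetical.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (chi-square, degrees of freedom) for an m by n contingency table."""
    expected = expected_counts(table)
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Rows: crown rust resistant, susceptible, intermediate.
# Columns: stem rust resistant, susceptible.
oat_cross = [[50, 22], [119, 22], [25, 6]]
chi2, dof = chi_square_independence(oat_cross)
print(round(chi2, 4), dof)
```

Because the marginal totals are fixed, only two of the six expected numbers are free, which is the reason for the 2 degrees of freedom.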

(b) The 2 by 2 or 4-Fold Table

The 4-fold table is often used to test the independence of two genetic factor
pairs. The independence of the two 3 : 1 ratios can be tested as follows:

               K                k                Total

V           a = 142          b = 43           a + b = 185
v           c =  49          d = 15           c + d =  64

Totals      a + c = 191      b + d = 58       N = 249

The value of χ² can be determined by the method outlined in (a) above, or it can be
computed by a short-cut formula given by Fisher (1934).

χ² = N(ad - bc)² / [(a + c)(b + d)(a + b)(c + d)]  - - - - (6)

   = 249[(142)(15) - (43)(49)]² / [(191)(58)(185)(64)]

   = (249)(529) / [(191)(58)(185)(64)]

   = 131,721 / 131,163,520

   = 0.0010

When χ² = 0.0010, P = value close to 1.
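
The short-cut formula is easily expressed as a small function. The sketch below is illustrative only; the name chi_square_2x2 is hypothetical, and a, b are taken as the first row of the 4-fold table and c, d as the second.

```python
def chi_square_2x2(a, b, c, d):
    """Short-cut chi-square for a 4-fold table.

    a, b are the cells of the first row; c, d those of the second row.
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    return numerator / denominator


chi2 = chi_square_2x2(142, 43, 49, 15)
print(round(chi2, 4))
```

The result can be compared with the value obtained by the longer method of (a), since both rest on the same marginal totals.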

(c) Inadequacy of χ²: Correction for Continuity

When the several categories are represented by relatively small frequencies,
the value of χ² often gives inaccurate results because the corresponding probability
of occurrence is too small. This is particularly the case in a 2-by-2 classification.
Yates (1934) has developed a correction that should be applied in such cases.
This correction simply amounts to the reduction of the numerical value of each (O-C)
determination by 1/2. Thus, in the example above, the correction applied to χ² is as
follows:

χ² (corrected) = N(|ad - bc| - N/2)² / [(a + c)(b + d)(a + b)(c + d)]  - - - - (7)

   = 249[|(142)(15) - (43)(49)| - 249/2]² / [(191)(58)(185)(64)]

   = (249)(-101.5)² / 131,163,520

   = 2,565,260.25 / 131,163,520

   = 0.0196

P = value close to 1.

In this case, even though the frequencies may be fairly large, it is quite proper to
introduce the correction. However, the larger the number of categories, the less
important is the correction.
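
A hedged Python sketch of the corrected formula follows. It assumes the usual form of the correction, in which the absolute value of ad - bc is taken before N/2 is subtracted. The function name is hypothetical. In this particular example |ad - bc| = 23 is smaller than N/2 = 124.5, so the bracket becomes negative before it is squared; the sketch simply follows the formula.

```python
def chi_square_2x2_yates(a, b, c, d):
    """Chi-square for a 4-fold table with the correction for continuity."""
    n = a + b + c + d
    # Assumption: the absolute difference |ad - bc| is reduced by N/2.
    corrected = abs(a * d - b * c) - n / 2
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    return n * corrected ** 2 / denominator


chi2_corrected = chi_square_2x2_yates(142, 43, 49, 15)
print(round(chi2_corrected, 4))
```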

X. The Null Hypothesis and χ²

It is important to understand something about the philosophical and logical bases for
the making of inferences from the χ² as well as from other criteria for significance.
The basic premise involved in every test for significance is a negative premise and
has been termed the null hypothesis by Fisher (1937). It is simply a tacit
assumption of agreement, such as agreement between standard deviations of distributions, and
agreement between distributions as a whole. In association and correlation studies
the null hypothesis is construed to mean independence or lack of association between
characters or conditions under investigation. This tacit negative premise can never
be proved. For example, it is impossible to prove statistically that two samples came
from the same population, or that the populations which afforded the samples under
comparison possess the same means or other statistics. It is impossible to prove
statistically that two characters or conditions are independent or devoid of
association. To draw such conclusions would simply be to reiterate what was originally only
assumed to be true.

Therefore, definite conclusions can be drawn only when the criteria for significance
have been met, such conclusions being positive in nature. The investigator is able
to prove differences to exist, association to be present, etc. In short, he is able
to prove the falsity of the null hypothesis but never its truth.

Questions for Discussion


1. What are the uses of the χ² criterion?

2. What conditions must be fulfilled in the use of the χ² test?

3. What is the range of χ² values? "P" values?

4. Give a rule for the number of degrees of freedom in a "goodness of fit" test.
What is it for a contingency table?

5. What precautions are necessary in the grouping of data for a "goodness of fit"
test? Why?

6. How do the Elderton and Fisher tables for χ² differ? What precautions are
necessary in the use of each?

7. Interpret "P" = 0.50 on the basis of goodness of fit.

8. In what special case can the normal probability integral table be used to compute
"P"? Why?

9. Who is responsible for the χ² test? For what was it first used?

10. Explain how to compute χ² for goodness of fit.

11. What precautions are necessary in the application of the χ² test for goodness of
fit to genetic ratios? Why?

12. In the fitting of observed data to that expected on the basis of the normal curve,
how many constants are used? Which ones?

13. Under what conditions may it be desirable to partition χ² into its components?

14. What does "P" = 0.01 indicate when obtained from a contingency table?

15. Explain how the probability is calculated for a cell in a contingency table.

16. How does the χ² test for independence differ from that for goodness of fit?

17. What is meant by the null hypothesis?

PROBLEMS

I. In a barley cross, Robertson (1929) tested black vs. white glumes (Bb) and hoods
vs. awns (Kk) for a 9 : 3 : 3 : 1 ratio in the F2. His data were as follows:

Classes                   Observed No.     Calculated No.

Black hooded (BK)             2611             2656.7
Black awned (Bk)               920              885.5
White hooded (bK)              860              885.5
White awned (bk)               332             [illegible]

Totals                        4723             4723

Calculate χ² and interpret it.* Do these data fit a 9 : 3 : 3 : 1 ratio for
independent inheritance?

II. Some data on hoods and awns (Kk) and covered vs. naked (Nn) in barley were
tested for a 9 : 3 : 3 : 1 ratio. The observed and calculated results were as
follows: (Data from Robertson, 1929)

Classes                   Observed No.     Calculated No.

Hooded covered (KN)           1969             2046
Hooded naked (Kn)              631              682
Awned covered (kN)             737              682
Awned naked (kn)               250              227

Totals                        3637             3637

Apply the χ² test and interpret it.

III. An F2 segregation of a barley cross, Colsess x Minnesota 84-7, gave these
results: (Data from Robertson, 1929)

Classes                   Observed No.

Hooded green (KF)              931
Hooded chlorina (Kf)           326
Awned green (kF)               326
Awned chlorina (kf)            119

(a) What ratio fits these data? (b) Apply χ² test and interpret it. (c) Calculate
the probability both from the table by Fisher and from the table of
Elderton.

IV. In the F2 of a certain barley cross there were 249 plants with high fertility of
the lateral spikelets and 67 with low fertility. Test these data for a 3 : 1
ratio by the χ² test for goodness of fit.

V. A second generation segregation in a barley dihybrid for high and low fertility
(Hh) and for black and white glume color (Bb) gave counts as follows:

HB        Hb        hB        hb

1547      568       478       184

When these data were tested for a calculated 9 : 3 : 3 : 1 ratio, χ² was 8.5718
with P = 0.0365. Partition χ² into its components and determine whether the
discrepancy is due to the individual 3 : 1 ratios or to linkage.

*Note: Statement for P: "A worse result might be expected on the basis of random
sampling ____ times in ____ trials."

VI. Some F2 oat plants were classified on the basis of crown rust and stem rust
resistance as follows:

                                Stem Rust Reaction
                            Resistant     Susceptible     Totals

Crown      Resistant            66             43           109
Rust       Susceptible          75             24            99
Reaction   Intermediate         17              5            22

           Totals              158             72           230

Use the χ² test for independence to determine whether or not there is an
association between the reaction to stem rust and crown rust.

CHAPTER  IX 


SIMPLE LINEAR CORRELATION


I.  Nature  of  Correlation 


So far, statistical analysis has dealt with a single set of observations to measure
a single character. It is now desirable to consider two such sets of observations
that measure two different characters. These observations are such that, to any
observation in one set, there is naturally paired a corresponding observation of the
other. One naturally inquires as to whether there exists any association or
connection between the measured characters. Such association exists when an abnormality¹
in one character tends to be accompanied by an abnormality in the other. The
characters are said to be correlated when such is the case. For example, height and weight
in human beings are said to be correlated. In the aggregate, tall persons are heavier
than short persons.

To condense what has been said into a precise definition, it may be stated that two
characters are correlated when, to a selected set of values of one, there correspond
sets of values of the other whose means are functions of those selected values.

II.  Description  of  Correlation 

A  graphical  representation  of  the  totality  of  paired  observations  can  be  obtained  by 
the  treatment  of  each  pair  of  measurements  as  the  rectangular  coordinates  of  a  point. 
Such  a  diagram  of  scattered  points  is  called  a  scatter  diagram.  To  illustrate,  one 
may  consider  20  pairs  of  observations  that  relate  length  (in  inches)  to  weight  (in 
ounces)  of  ears  of  corn: 


Length (x)    Weight (y)        Length (x)    Weight (y)

   2.5           3.5               6.5           6.5
   2.5           3.0               7.5          10.0
   3.0           5.0               8.0           8.0
   4.0           7.0               8.0          10.0
   4.5           5.5               8.0          12.0
   5.0           8.0               8.5          13.0
   5.5           8.0               9.0          12.0
   6.0          10.0               9.0          14.0
   6.0           7.0               9.5          13.0
   6.5          10.5              10.5          14.0

Mean length (x̄) = 6.5 inches.    Mean weight (ȳ) = 9.0 ounces.

From these pairs of measurements a scatter diagram can be made as follows:


[Scatter diagram: ear weight (oz.) plotted against ear length (in.) for the 20 ears, with lines drawn at the means x̄ = 6.5 and ȳ = 9.0.]

¹Abnormality refers to deviations from the mean.

From the diagram, it is clear that the horizontal and vertical lines that represent
the mean length and weight of the ears in the sample separate the plane, in which the
points are plotted, into four regions or quadrants. It is also evident that most of
the points fall into two of these regions, i.e., those which describe the
abnormalities in regard to the characters to be of the same type, above the average and below
the average. Thus, there appears to exist a direct or positive correlation between
the characters.

The totality of points that form the scatter very often possess the rough geometrical
form of an ellipse. The position of the ellipse indicates the type of association,
i.e., whether positive (direct) or negative (inverse). The shape of the ellipse
roughly estimates the degree of correlation. The characters are closely related when
the ellipse is narrow. A diagrammatic representation of correlation is given in
Figures A, B, and C.¹


¹Some statisticians use the first quadrant in correlation analysis while others use
the fourth.

Figure A. Low Correlation
Figure B. Positive Correlation
Figure C. Negative Correlation

The signs for the quadrants are depicted in Figure A. It is noted that the values
of x above the mean (x̄) are positive, while those below the mean are negative. The
same applies for the y values. The sign for the quadrant is the product of the
corresponding marginal signs.

There  are  two  methods  employed  to  describe  correlation,  i.e.,  the  correlation  surface 
method  and  the  regression  method.  For  an  account  of  the  correlation  surface  method, 
a  text  on  mathematical  statistics  should  be  consulted. 

A  --  The  Correlation  Coefficient 


III.  Measurement  of  Correlation 

A precise mathematical measure of the degree of association between two characters is
desirable. In any case, it must be based on an assumption in regard to the
mathematical functional relationship that exists between the variables. The most important
measure is called the coefficient of correlation, symbolized as r. In the discussion
that follows it is assumed that the association is linear, i.e., that the variables
x and y are related by an equation, y = ax + b, where a and b are constants.

Suppose one considers each pair of measurements of the two characters as an argument,
either strong or weak, for one or the other of two opposite theories of association
between the two characters. These theories are that the two characters are related,
either positively or negatively. A linear relationship is said to exist between two
characters when the means of the values of one character are plotted with the selected
values of the other character that correspond to them so that the resulting points
are well fitted by a straight line. To measure the contribution of any given pair of
measurements (x, y) to one theory of association or the other, one measures the amount
of abnormality exhibited by the pair of measurements with respect to each character
in units of the respective standard deviations of the samples provided by the 2 sets
of variates, i.e., (x - x̄)/s'x and (y - ȳ)/s'y. When these measures of the
abnormalities of the pairs of observations are multiplied, i.e.,

(x - x̄)(y - ȳ) / (s'x s'y),

the result gives a numerical measure of the argument presented by (x, y) toward a
theory of correlation. The product of both abnormalities will be positive when both
are of the same type, either positive or negative. Their product will be negative
when the abnormalities are opposite in type. A numerical measure of correlation
between the characters under investigation is obtained when the procedure is repeated
for every pair of measurements in the sample and the arithmetic mean of the several
products is found. The formula for the correlation coefficient (r) is as follows:

r = (1/N) S [(x - x̄)/s'x] [(y - ȳ)/s'y]  - - - - (1)

It is obvious that "r" can be plus or minus, thus depicting a positive or negative
correlation. It will be shown later that "r" is numerically equal to or less than
1.0. Thus, the association that exists between two characters may be strong, as
evidenced by a value of "r" numerically close to 1.0, or weak when "r" is close to 0.

The above statement must not be construed too literally but in the light of sampling
theory.
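
The definition of r as the mean product of deviations measured in standard-deviation units can be illustrated with a short Python sketch. It is not taken from the manual; the function name and the three small data sets are hypothetical, and s' is taken to be the standard deviation computed with divisor N.

```python
import math


def correlation_from_standard_units(xs, ys):
    """Mean of the products of paired deviations, each in standard-deviation units."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Assumption: s' is the standard deviation with divisor N.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    products = [
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys)
    ]
    return sum(products) / n


# Hypothetical illustrative values, not data from the text.
r_direct = correlation_from_standard_units([1, 2, 3, 4], [2, 4, 6, 8])   # points on a rising line
r_inverse = correlation_from_standard_units([1, 2, 3, 4], [8, 6, 4, 2])  # points on a falling line
r_partial = correlation_from_standard_units([1, 2, 3, 4], [2, 1, 4, 3])  # looser association
print(r_direct, r_inverse, r_partial)
```

Pairs whose two deviations have the same sign add to the sum and pairs of opposite sign subtract from it, so the sign and size of r follow directly from the quadrants of the scatter diagram.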

IV. Computation of "r" for Ungrouped Data

The relation, r = (1/N) S [(x - x̄)/s'x] [(y - ȳ)/s'y], may be transformed to many
different arbitrary forms for computation.¹ Formulas which are useful for small
samples are as follows:

r = [S(xy) - N x̄ ȳ] / √[(Sx² - N x̄²)(Sy² - N ȳ²)]  - - - - (2)

r = [S(xy)/N - x̄ ȳ] / (s'x s'y)  - - - - (3)

r = [N S(xy) - (Sx)(Sy)] / √{[N S(x²) - (Sx)²] [N S(y²) - (Sy)²]}  - - - - (4)

Formula (3) is the one given by J. Arthur Harris, which is direct, but not suited so
well to machine calculation as (2) or (4).

¹(s'x)² = (Sx² - N x̄²)/N and (s'y)² = (Sy² - N ȳ²)/N

r = (1/N) S(x - x̄)(y - ȳ) / (s'x s'y)

  = (1/N) [Sxy - x̄ S(y) - ȳ S(x) + N x̄ ȳ] / √{[(Sx² - N x̄²)/N] [(Sy² - N ȳ²)/N]}

  = (1/N) (Sxy - N x̄ ȳ - N x̄ ȳ + N x̄ ȳ) / {(1/N) √[(Sx² - N x̄²)(Sy² - N ȳ²)]}

  = (Sxy - N x̄ ȳ) / √[(Sx² - N x̄²)(Sy² - N ȳ²)]

The computation may be illustrated with these data on the length of corn ears in
inches and their weight in ounces.

Length (x)    Weight (y)        x²           y²           xy

   2.5           3.5            6.25        12.25         8.75
   2.5           3.0            6.25         9.00         7.50
   3.0           5.0            9.00        25.00        15.00
   4.0           7.0           16.00        49.00        28.00
   4.5           5.5           20.25        30.25        24.75
   5.0           8.0           25.00        64.00        40.00
   5.5           8.0           30.25        64.00        44.00
   6.0          10.0           36.00       100.00        60.00
   6.0           7.0           36.00        49.00        42.00
   6.5          10.5           42.25       110.25        68.25
   6.5           6.5           42.25        42.25        42.25
   7.5          10.0           56.25       100.00        75.00
   8.0           8.0           64.00        64.00        64.00
   8.0          10.0           64.00       100.00        80.00
   8.0          12.0           64.00       144.00        96.00
   8.5          13.0           72.25       169.00       110.50
   9.0          12.0           81.00       144.00       108.00
   9.0          14.0           81.00       196.00       126.00
   9.5          13.0           90.25       169.00       123.50
  10.5          14.0          110.25       196.00       147.00

S(x) = 130.0   S(y) = 180.0   S(x²) = 952.50   S(y²) = 1837.00   S(xy) = 1310.50

x̄ = 6.5        ȳ = 9.0

The symbols x̄ and ȳ are the means of the x and y arrays. The values S(x²) and S(y²)
are obtained by squaring each separate entry of x and y, respectively, and summing
the squares. The value S(xy) is the summation of the product of each
value of x by the corresponding value of y. In practice, only the sums of the various
values are recorded in machine calculation.

The values may be substituted in (2) as follows:

r = [S(xy) - N x̄ ȳ] / √[(Sx² - N x̄²)(Sy² - N ȳ²)]

  = [1310.50 - (20)(6.5)(9.0)] / √[(952.50 - 845.00)(1837.00 - 1620.00)]

  = (1310.50 - 1170.00) / √[(107.50)(217.00)] = 140.50 / √23327.50 = 0.920
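
The substitution can be repeated by machine. The sketch below, in Python, is an illustration only and assumes the 20 pairs of ear lengths and weights as listed in the table. It forms the five sums once and then evaluates the mean form (2) and the sum form (4) of the formula, which should agree. The function names are hypothetical.

```python
import math

lengths = [2.5, 2.5, 3.0, 4.0, 4.5, 5.0, 5.5, 6.0, 6.0, 6.5,
           6.5, 7.5, 8.0, 8.0, 8.0, 8.5, 9.0, 9.0, 9.5, 10.5]
weights = [3.5, 3.0, 5.0, 7.0, 5.5, 8.0, 8.0, 10.0, 7.0, 10.5,
           6.5, 10.0, 8.0, 10.0, 12.0, 13.0, 12.0, 14.0, 13.0, 14.0]


def basic_sums(xs, ys):
    """Return N, S(x), S(y), S(x^2), S(y^2), S(xy)."""
    return (
        len(xs),
        sum(xs),
        sum(ys),
        sum(x * x for x in xs),
        sum(y * y for y in ys),
        sum(x * y for x, y in zip(xs, ys)),
    )


def r_from_means(xs, ys):
    """Form (2): uses the means of x and y."""
    n, sx, sy, sxx, syy, sxy = basic_sums(xs, ys)
    mean_x, mean_y = sx / n, sy / n
    return (sxy - n * mean_x * mean_y) / math.sqrt(
        (sxx - n * mean_x ** 2) * (syy - n * mean_y ** 2)
    )


def r_from_sums(xs, ys):
    """Form (4): uses only the sums, so no rounded means enter."""
    n, sx, sy, sxx, syy, sxy = basic_sums(xs, ys)
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx ** 2) * (n * syy - sy ** 2)
    )


r2 = r_from_means(lengths, weights)
r4 = r_from_sums(lengths, weights)
print(basic_sums(lengths, weights), round(r2, 3), round(r4, 3))
```

The sum form avoids carrying rounded means through the calculation, which is the reason it is preferred for machine work.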

Those who use this formula for the computation of r are warned that a serious error
may be introduced by dropping decimals. The means should be carried out to twice the
number of decimal places as appear in the original data. The formulae given above are
particularly valuable when N is small, i.e., less than 50.

V. Calculation from a Correlation Surface

The correlation coefficient may be calculated from a correlation surface with the
deviations from the assumed means taken on an arbitrary scale. It is necessary to
apply corrections for the means, standard deviations, and class intervals. Fisher
(1934) has made a contribution to simplicity in the mechanical computation of the
correlation coefficient, his method¹ being used below. The data are for the
correlation of total grain weight (x) in grams and culm length (y) in centimeters in wheat
plants.

¹In the determination of standard deviations where Sheppard's Correction has been
used, the uncorrected standard deviations should be used in computing r.

Table 1. Correlation Table for Grain Weight and Culm Length in Wheat.

Culm length (y)            Grain weight (x): class center, with coded value X beneath
Class            9.5   29.5   49.5   69.5   89.5  109.5  129.5  149.5  169.5  189.5  209.5
center      Y  X = 1      2      3      4      5      6      7      8      9     10     11     fy

   62       1      -      1      -      2      -      -      -      -      -      -      -      3
   67       2      -      4      1      2      -      -      -      -      -      -      -      7
   72       3      1      4      3      2      2      2      -      -      -      -      -     14
   77       4      -      2      8      -      4      2      2      -      -      -      -     18
   82       5      -      2      7      4      7      8      5      1      -      -      -     34
   87       6      -      2      6     12      5      8     17      6      1      -      4     61
   92       7      -      -      8     28      2      2     16     20      5      2      -     83
   97       8      -      -      7     21     24     14      3      7      4      2      1     83
  102       9      -      -      2     22     34     13      1      1      8      6      1     88
  107      10      -      -      1      3     32     26      6      -      1      2      -     71
  112      11      -      -      -      -      4     15      6      -      1      -      -     26
  117      12      -      -      -      -      -      -      5      -      1      -      -      6

   fx              1     15     43     96    114     90     61     35     21     12      6    494

The data may be arranged as follows:

Table 2. Computation of the Correlation Coefficient

Av. length of culms (y)

   (1)      (2)     (3)      (4)       (5)        (6)          (7)
  Class                                         Total for     Product
  center     Y      fy       Yfy      Y²fy      wt. S'Xf       YS'Xf

    62       1       3         3         3         10            10
    67       2       7        14        28         19            38
    72       3      14        42       126         48           144
    77       4      18        72       288         74           296
    82       5      34       170       850        167           835
    87       6      61       366      2196        363          2178
    92       7      83       581      4067        495          3465
    97       8      83       664      5312        453          3624
   102       9      88       792      7128        500          4500
   107      10      71       710      7100        402          4020
   112      11      26       286      3146        161          1771
   117      12       6        72       864         44           528

  Totals           494      3772     31108       2736         21409
                   (N)      (SY)     (SY²)       (SX)         (SXY)

Av. grain weight (x)

   (8)      (9)    (10)     (11)      (12)       (13)         (14)
  Class                                         Total for     Product
  center     X      fx       Xfx      X²fx     length S'Yf     XS'Yf

    9.5      1       1         1         1          3             3
   29.5      2      15        30        60         51           102
   49.5      3      43       129       387        254           762
   69.5      4      96       384      1536        696          2784
   89.5      5     114       570      2850        963          4815
  109.5      6      90       540      3240        770          4620
  129.5      7      61       427      2989        466          3262
  149.5      8      35       280      2240        246          1968
  169.5      9      21       189      1701        178          1602
  189.5     10      12       120      1200        104          1040
  209.5     11       6        66       726         41           451

  Totals           494      2736     16930       3772         21409
                   (N)      (SX)     (SX²)       (SY)         (SXY)

Ȳ = 3772/494 = 7.6356          X̄ = 2736/494 = 5.5385


The details of computation are explained as follows:

1. To simplify the arithmetic, the variables X and Y are used in place of x and y,
respectively. They are related by:

X = (x - x₁)/Cx + 1

Y = (y - y₁)/Cy + 1

where x₁ and y₁ represent the class centers of the first classes, and Cx, Cy are
the class intervals of the x and y distributions, respectively.

2. The values for Yfy (in column 4) are the products of the class values, Y, and their
respective frequencies. The values for column 11 are computed in a similar manner.

3. The values for Y²fy (in column 5) are the products of columns 2 and 4 for the
respective values of Y. The X²fx values in column 12 are computed from columns
9 and 11.

4. The total deviations in culm length (y variable) are shown in column 13 for each
column for grain weight (x variable). Here the symbol (f) without subscripts
indicates the frequency of one cell of the correlation table, i.e., the frequency
of a particular value of X accompanied by a particular value of Y. The symbol S'
denotes the total over just one array. It is necessary to refer to Table 1
(columns 2 and 3) to compute these values. In each term below the first factor is
the cell frequency and the second is its Y value.

1st Y-array = (1)(3) = 3

2nd Y-array = (1)(1) + (4)(2) + (4)(3) + (2)(4) + (2)(5) + (2)(6) = 51

3rd Y-array = (1)(2) + (3)(3) + (8)(4) + (7)(5) + (6)(6) + (8)(7)
              + (7)(8) + (2)(9) + (1)(10) = 254, etc.

The values for the X-arrays in column 6 are computed in a similar manner.

5. For the product (XS'Yf) multiply each value of the total for length in column 13
by its respective X-value in column 9. For example, (3)(1) = 3, (51)(2) = 102,
etc. The values in column 7 are computed similarly. It is noted that the ultimate
result (SXY) of the computations carried out in columns (6)(7) and (13)(14) is the
same. Thus, one provides a check on the other.

6. The computed values in Table 2 are then substituted in formula No. 2 above:¹

r = [S(XY) - N X̄ Ȳ] / √[(SX² - N X̄²)(SY² - N Ȳ²)]

  = [21409 - (494)(7.6356)(5.5385)] / √{[16930 - (494)(5.5385)²] [31108 - (494)(7.6356)²]}

  = (21,409 - 20,891.16) / √[(16,930 - 15,153.45)(31,108 - 28,801.39)]

  = 517.84 / √4,097,808.00

  = 517.84 / 2024.30 = 0.2558
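
The whole of the grouped computation can be retraced from the cell frequencies. The Python sketch below is illustrative only: the 12 by 11 array is the set of cell frequencies as read from Table 1 (rows Y = 1 to 12, columns X = 1 to 11) and should be treated as an assumption wherever the printed table is in doubt. The sums are accumulated cell by cell and r is obtained from the sum form of the formula, so no rounded means are used.

```python
import math

# Cell frequencies as read from Table 1.
# Rows: coded culm length Y = 1..12. Columns: coded grain weight X = 1..11.
TABLE_1 = [
    [0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0],
    [0, 4, 1, 2, 0, 0, 0, 0, 0, 0, 0],
    [1, 4, 3, 2, 2, 2, 0, 0, 0, 0, 0],
    [0, 2, 8, 0, 4, 2, 2, 0, 0, 0, 0],
    [0, 2, 7, 4, 7, 8, 5, 1, 0, 0, 0],
    [0, 2, 6, 12, 5, 8, 17, 6, 1, 0, 4],
    [0, 0, 8, 28, 2, 2, 16, 20, 5, 2, 0],
    [0, 0, 7, 21, 24, 14, 3, 7, 4, 2, 1],
    [0, 0, 2, 22, 34, 13, 1, 1, 8, 6, 1],
    [0, 0, 1, 3, 32, 26, 6, 0, 1, 2, 0],
    [0, 0, 0, 0, 4, 15, 6, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 5, 0, 1, 0, 0],
]


def coded_sums(table):
    """Return N, S(X), S(Y), S(X^2), S(Y^2), S(XY) over all cells."""
    n = sx = sy = sxx = syy = sxy = 0
    for y_code, row in enumerate(table, start=1):
        for x_code, f in enumerate(row, start=1):
            n += f
            sx += f * x_code
            sy += f * y_code
            sxx += f * x_code ** 2
            syy += f * y_code ** 2
            sxy += f * x_code * y_code
    return n, sx, sy, sxx, syy, sxy


def correlation_from_table(table):
    """Correlation coefficient of the coded values, using sums only."""
    n, sx, sy, sxx, syy, sxy = coded_sums(table)
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx ** 2) * (n * syy - sy ** 2)
    )


r_grouped = correlation_from_table(TABLE_1)
print(coded_sums(TABLE_1), round(r_grouped, 4))
```

The six sums returned by coded_sums correspond to the totals of Table 2, so a disagreement at that point would indicate an error in the cell frequencies rather than in the formula.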


¹The data in the problem above have been coded. Suppose a = assumed mean, and C =
class interval. It can be shown that the correlation coefficient from coded data
is equal to that from the natural numbers, viz., $r_{xy} = r_{XY}$.

$$ X = (x - a_x)/C_x \quad \text{and} \quad Y = (y - a_y)/C_y $$

$$ x - \bar{x} = C_x(X - \bar{X}), \quad \text{and} \quad s'_x = C_x s'_X $$

$$ r_{xy} = \frac{S(x - \bar{x})(y - \bar{y})}{s'_x\, s'_y} = \frac{C_x C_y\, S(X - \bar{X})(Y - \bar{Y})}{C_x C_y\, s'_X\, s'_Y} = r_{XY} $$
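The statement that coding leaves r unchanged can be illustrated numerically. This is a sketch under stated assumptions: the six paired values, the assumed means (49.5 and 72), and the class intervals (20 and 5) are hypothetical and are not taken from Table 2.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation for ungrouped paired values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical natural values (x, y).
x_natural = [9.5, 29.5, 49.5, 69.5, 89.5, 109.5]
y_natural = [62, 72, 67, 82, 77, 92]

# Coding: subtract an assumed mean and divide by the class interval.
x_coded = [(x - 49.5) / 20 for x in x_natural]
y_coded = [(y - 72) / 5 for y in y_natural]

r_natural = pearson_r(x_natural, y_natural)
r_coded = pearson_r(x_coded, y_coded)
print(r_natural, r_coded)
```

The two printed values agree except for floating-point rounding, since the class intervals cancel between numerator and denominator.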



7. The true means can be computed from the above values as follows:

$$ \bar{y} = (\bar{Y} - 1)C_y + y_1 = (7.6356 - 1)(5) + 62 = 95.1780 $$

$$ \bar{x} = (\bar{X} - 1)C_x + x_1 = (5.5385 - 1)(20) + 9.5 = 100.2700 $$

VI. Use of the Correlation Coefficient for Error of a Difference

The correlation coefficient may be used to reduce the standard error of a difference
($\sigma_d$) when there exists a correlation between the paired values of two variables.
This usually enables one to obtain significance with smaller differences than is
possible with the formula, $\sigma_d = \sqrt{a^2 + b^2}$, given previously (See Chapter 6). However,
it is seldom worthwhile to apply the correlation formula unless "r" is large because
the reduction in error is usually insufficient to justify the greater amount of
calculation. The extended formula for the standard error of a difference is as follows:

$$ \sigma_d = \sqrt{a^2 + b^2 - 2\, r_{ab}\, ab} \tag{5} $$


In  this  formula  a  and  b  represent  the  standard  errors  of  the  separate  values  being 
compared,  and  r,   the  coefficient  of  correlation  between  the  separate  measurements  of 
these  quantities. 

The averages for the heading and blossoming stages of irrigation of spring wheat over
a 9-year period² may be taken to show the value of the correlation coefficient in
the reduction of the standard error. The average yields of grain in pounds per plot,
together with their standard errors ($\sigma_{\bar{x}}$), are as follows:

                 Stage of Irrigation
Year        Heading                 Blossoming

1921         478 ± 52                440 ± 46
1922         776 ± 57                637 ± 31
1923        1114 ± 58                947 ± 49
1924        1218 ± 53               1189 ± 52
1925         555 ± 28                524 ± 27
1926        [illegible] ± 59         645 ± [illegible]
1927        [illegible] ± 59        1035 ± 39
1928         639 ± [illegible]       614 ± 35
1929         895 ± 113               839 ± 107
Mean         833 ± 19                762 ± 18

The coefficient of correlation was calculated for the paired annual yields by the use
of the formula for the ungrouped data as explained in paragraph IV, viz., r = +0.406.

$$ \sigma_d = \sqrt{a^2 + b^2 - 2\, r_{ab}\, ab} = \sqrt{(19)^2 + (18)^2 - (0.812)(19)(18)} = 20.17 $$

$$ d/\sigma_d = 71/20.17 = 3.52 $$


The standard error of the difference, calculated without the use of the correlation
coefficient to reduce the error, was as follows:

$$ \sigma_d = \sqrt{a^2 + b^2} = \sqrt{(19)^2 + (18)^2} = 26.17 $$

$$ d/\sigma_d = 71/26.17 = 2.71 $$
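The two computations can be compared side by side. The sketch below is illustrative only; `se_difference` is a hypothetical helper, and the inputs are the values quoted in the text (standard errors 19 and 18, r = +0.406, difference of means 71).

```python
import math

def se_difference(a, b, r=0.0):
    """Standard error of a difference; r = 0 gives the simple formula."""
    return math.sqrt(a ** 2 + b ** 2 - 2 * r * a * b)

difference = 71                              # difference between the two means
se_simple = se_difference(19, 18)            # ignores the correlation
se_extended = se_difference(19, 18, 0.406)   # formula (5)

print(round(se_simple, 2), round(difference / se_simple, 2))
print(round(se_extended, 2), round(difference / se_extended, 2))
```

The results agree with the hand computation within rounding: roughly 26.17 and 2.71 without the correlation term, and close to 20.2 and 3.52 with it.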

It is apparent how the test comparing the averages of the yearly means is strengthened
by taking into account the correlation due to years.

¹$y_1$ = first true class value. $C_y$, $C_x$ = class intervals.

²Robertson, D. W., et al. Studies on the Critical Period of Applying Water to Wheat.
Data from Colorado Experiment Station.

VII.  Significance  of  the  Correlation  Coefficient 

The test for significance is to determine the probability (P) that the observed
correlation could have arisen by random sampling from a population in which the
correlation is zero. The t-test is more accurate for small samples while the standard
error test is satisfactory for large samples.

(a)  The  Standard  Error  Test 

In large samples drawn from a population in which the mean value of r is
zero, the standard error of "r" is given by:

$$ \sigma_r = \frac{1 - r^2}{\sqrt{N - 1}} \tag{6} $$

From the standard error, $r/\sigma_r$ is computed to determine significance. When $r/\sigma_r$ is
less than 2.0, the relation is probably due to chance rather than to correlation
between the variables compared. Fisher (1934) states that, in the use of the above
test, the value of r itself introduces an error which is magnified when r is squared.
Only in the case of large samples (greater than 100 pairs of observations) can the
standard error test be used safely. Further, the distribution of r, at least for the
stronger values, is so skewed that it is unwise to make any interpretation of
difference in terms of $\sigma_r$ based on probabilities related to the normal curve.

(b)  The  t-test  for  Significance 

For small samples, the distribution of r is not sufficiently close to normal
to justify the ordinary standard error test. Fisher (1934) has developed the "t"
test as a more accurate test for significance. This test measures the probability of
obtaining a given value of r from a sample of paired values of a given size due to
chance alone. A value of this probability of less than P = 0.05 indicates that the
association of the characters is unlikely to be due to chance, and it is therefore
judged significant. The formula for "t" for a correlation coefficient is as follows:

$$ t = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}} \tag{7} $$

In this formula N = the number of pairs of observations. The degrees of freedom for
the estimation of a correlation coefficient are N - 2 due to the fact that two
statistics are calculated from the sample.

The  use  of  "t"  may  be  illustrated  with  the  correlation  of  ear  length  (x)  and  weight 
(y)  in  corn  (Par.  IV). 


$$ t = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}} = \frac{0.937\sqrt{20 - 2}}{\sqrt{1 - (0.937)^2}} = 11.38 $$

In the "t" table it is noted that for 18 degrees of freedom, the value of t required
for P = 0.05 is 2.101. Thus, the above value is judged to be highly significant.
The same result can be obtained from Table VA in Fisher (1934).
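The computation of "t" from r and N is easily mechanized. The following sketch assumes only the values given in the text (r = 0.937, N = 20, and the tabled value 2.101 for 18 degrees of freedom); the function name `t_for_correlation` is hypothetical.

```python
import math

def t_for_correlation(r, n_pairs):
    """t statistic for a correlation coefficient, with n_pairs - 2 degrees of freedom."""
    return r * math.sqrt(n_pairs - 2) / math.sqrt(1 - r ** 2)

t_value = t_for_correlation(0.937, 20)
critical_t = 2.101  # value of t for P = 0.05 and 18 degrees of freedom, as quoted above

print(round(t_value, 2), t_value > critical_t)
```

The same function serves for any sample size; only the tabled critical value changes with the degrees of freedom.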

(c) Difference between Correlation Coefficients

A test for the significance of differences between correlation coefficients
has been suggested by Fisher (1934) as follows:

$$ z' = \tfrac{1}{2}\,[\log_e(1 + r) - \log_e(1 - r)] \tag{8} $$

The standard error would be as follows:

$$ \sigma_{z'} = \frac{1}{\sqrt{N - 3}} \tag{9} $$

The method may be illustrated from an example given by Goulden (1937) who studied the
relation between the carotene content of wheat flour and the color of bread for 139
wheat varieties. The correlation coefficients were as follows:

Carotene in whole wheat with crumb color, $r_1$ = -0.4951

Carotene in flour with crumb color, $r_2$ = -0.5791

The z' test would be applied as follows:

$$ z'_1 = \tfrac{1}{2}\,[\log_e(1 + 0.4951) - \log_e(1 - 0.4951)] = \tfrac{1}{2}\,[\log_e 1.4951 - \log_e 0.5049] $$

$$ = \tfrac{1}{2}\log_e\frac{1.4951}{0.5049} = \tfrac{1}{2}\log_e 2.9612 = 0.5428 $$

$$ z'_2 = \tfrac{1}{2}\,[\log_e(1 + 0.5791) - \log_e(1 - 0.5791)] = \tfrac{1}{2}\log_e\frac{1.5791}{0.4209} = \tfrac{1}{2}\log_e 3.7517 = 0.6612 $$

$$ d_{z'} = z'_2 - z'_1 = 0.6612 - 0.5428 = 0.1184 $$

$$ \sigma_{z'_2 - z'_1} = \sqrt{\frac{1}{136} + \frac{1}{136}} = 0.1213 $$

$$ d_{z'}/\sigma_{z'_2 - z'_1} = 0.1184/0.1213 = 0.9761 $$

Since  the  difference   is  less  than  its   standard  error,    it   is  not   significant. 
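The z' comparison above can be reproduced as follows. This is a sketch, not Goulden's own procedure; the function names are hypothetical, and, as in the text, only the numerical values of r (0.4951 and 0.5791) and N = 139 for each coefficient are used.

```python
import math

def fisher_z(r):
    """z' transformation of a correlation coefficient, formula (8)."""
    return 0.5 * (math.log(1 + r) - math.log(1 - r))

def se_z_difference(n1, n2):
    """Standard error of the difference between two z' values."""
    return math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))

z1 = fisher_z(0.4951)
z2 = fisher_z(0.5791)
z_difference = z2 - z1
z_se = se_z_difference(139, 139)

print(round(z1, 4), round(z2, 4), round(z_se, 4), round(z_difference / z_se, 2))
```

The ratio comes out a little under 1.0, in agreement with the conclusion that the difference is less than its standard error.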

The  formula  for  z'   deals  only  with  the  numerical  value  of  r,   no  attention  being  paid 
to  algebraic  signs.      It  may  be  noted  that  the   z  test  for  significance  of  r  is   superior 
to  the   devices  heretofore  described. 

VIII.  Interpretation  of  the  Correlation  Coefficient 

Certain precautions are necessary in correlation analysis. First of all, the
characters of the individuals under consideration must be paired for some logical reason.
The sample should also be representative of the population. Ordinarily it is
inadvisable to calculate correlations on numbers where N is less than 30. Caution should
be used in the application of correlation statistics where N is less than 50.

Spurious correlation is a condition where the things compared are not causally
related, but are both related to a third cause. Frequently there is a tendency to assume
that a significant correlation coefficient is proof of a causal relation between two
variables. This may not be true. Extreme caution should be used in inferring cause
from a correlation coefficient.

B -- Linear Regression

IX. Theory of Regression

A regression is said to be linear when the means of the sets of values of one
character which correspond to given values of the other character can be well fitted
graphically by a straight line. Under such conditions the coefficient of correlation (r)
is a valid measure of association. From the definition, it is evident that there
must be two regression lines. They are termed the lines of regression of x on y, and
y on x.

[Diagram: scatter of the x-distribution and y-distribution, with the regression lines AB and CD.]

This diagram should give a clear conception of what is meant by regression.
The elliptical nature of the scatter is shown with the dots which indicate
the means of the individuals in each array, both horizontal and vertical.
The means of all the rows fall approximately in a straight line, as well as
those for columns. These lines, called the regression lines, intersect at a
point which indicates the means of the two general distributions, x and y.
The mathematical equations of these lines can be obtained by the method of
least squares. The line AB is the regression of x on y. Its equation is
as follows:


$$ x - \bar{x} = r\,\frac{s_x}{s_y}\,(y - \bar{y}) \tag{9} $$

Where x is the value estimated, say $x_e$.

The line CD is the regression line of y on x. Its equation is as follows:

$$ y - \bar{y} = r\,\frac{s_y}{s_x}\,(x - \bar{x}) \tag{10} $$

Where y is the value estimated, say $y_e$.

These equations may be used to predict or estimate the most probable value of one
character to accompany or be associated with a given value of the other character.
When a certain value is given y in the equation $x_e - \bar{x} = r\,(s'_x/s'_y)(y - \bar{y})$, one can
solve for the predicted value of x that corresponds to it. Likewise when a value is
given to x, in $y_e - \bar{y} = r\,(s'_y/s'_x)(x - \bar{x})$, the most likely value for the y that
accompanies it can be found. Predicted values given by the regression equations are
conservative. Actually, the term "regression" is a result of this tendency. These
equations have little or no value for prognosis unless r is quite strong. The two
diagrams below depict predicted values given by regression equations in two cases,
i.e., where the correlation is strong and where it is weak.


[Diagrams: Case I, strong correlation, and Case II, weak correlation, each showing a given value x and the regression line CD.]

In each case, observe the same given value of x indicated by the point, x, on the
upper line of each diagram. The vertical line from the point, x, to the line CD
measures the predicted value of y. The portion (AB) illustrates the amount of
abnormality predicted. This is seen to be much smaller in case II where the
correlation is weak. Moreover, the standard error of an estimated value is so large that,
unless "r" is high, the reliability of an estimated value is small. Although a
single predicted value is of little avail unless a very high degree of association exists
between two characters, the regression measured by the coefficients $r\,s_y/s_x$ and
$r\,s_x/s_y$ may be quite appreciable in one case or the other, even when r is small.
This is due to the fact that variation in one character may be quite low. For
instance, the association between the yield of a crop obtained from several plots and a
certain treatment given in various degrees of intensity to the plots may be quite
low. The first reaction would be that the treatment is not justified. However, the
regression might be appreciable, so that the treatment might be very worth while for
the crop as a whole.

The more important interpretation of a predicted value from a regression equation
lies in the fact that it may be considered as a mean estimated value of the variable
which may be expected to result in connection with a number of identical values of
the second variable. Such an estimated mean would have a standard error = $s_e/\sqrt{m}$, where
m = number of repeated cases of the second variable, and $s_e$ is the standard error of
regression (See Section X (c) below).

X.  Computation  of  Regression  Equations  for  Grouped  Data 

The equation for the regression coefficient is as follows:

$$ b_{yx} = \frac{S\,Y(X - \bar{X})}{S(X - \bar{X})^2} $$

The most convenient formulae for machine computation are as follows:

$$ b_{yx} = \frac{S(X - \bar{X})(Y - \bar{Y})}{S(X - \bar{X})^2} \tag{11} $$

$$ b_{yx} = \frac{S(XY) - (SX)(SY)/N}{S(X^2) - (SX)^2/N} \tag{12} $$

or

$$ b_{yx} = \frac{N\,S(XY) - (SX)(SY)}{N\,S(X^2) - (SX)^2} \tag{13} $$

Where the transformed variables (X, Y) are not used, the same relations hold in terms
of the original variables (x, y).

(a) Computation of Regression Coefficients

The computation may be illustrated for the correlation between total grain
weight and average lengths of culms in wheat plants (Paragraph V above). Calculations
from Table 2, which can be used here, are as follows for coded data:

SXY = 21,409        SX = 2736           SY = 3772

$\bar{X}$ = 5.5385          $\bar{Y}$ = 7.6356

N = 494             SX² = 16,930        SY² = 31,108

By substitution in formula 13:

$$ b_{yx} = \frac{N\,S(XY) - (SX)(SY)}{N\,S(X^2) - (SX)^2} = \frac{(494)(21{,}409) - (2736)(3772)}{(494)(16{,}930) - (2736)^2} $$

$$ = \frac{10{,}576{,}046 - 10{,}320{,}192}{8{,}363{,}420 - 7{,}485{,}696} = \frac{255{,}854}{877{,}724} = 0.2915 $$

$$ b_{xy} = \frac{N\,S(XY) - (SX)(SY)}{N\,S(Y^2) - (SY)^2} = \frac{(494)(21{,}409) - (2736)(3772)}{(494)(31{,}108) - (3772)^2} $$

$$ = \frac{255{,}854}{15{,}367{,}352 - 14{,}227{,}984} = \frac{255{,}854}{1{,}139{,}368} = 0.2246 $$
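Formula 13 lends itself to a short machine check. The sketch below uses the coded totals quoted from Table 2; the function name `regression_coefficients` is hypothetical. It also verifies the known relation that the product of the two regression coefficients equals r².

```python
def regression_coefficients(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Return (b_yx, b_xy) from raw totals, following formula 13."""
    numerator = n * sum_xy - sum_x * sum_y
    b_yx = numerator / (n * sum_x2 - sum_x ** 2)
    b_xy = numerator / (n * sum_y2 - sum_y ** 2)
    return b_yx, b_xy

# Coded totals quoted from Table 2.
b_yx, b_xy = regression_coefficients(494, 2736, 3772, 21409, 16930, 31108)
r_squared = b_yx * b_xy  # product of the two coefficients equals r squared

print(round(b_yx, 4), round(b_xy, 4), round(r_squared ** 0.5, 4))
```

The square root of the product returns the correlation coefficient found earlier for the same data, which serves as a check on both computations.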

(b) Substitution in Regression Equation

The equation for the regression of Y on X is as follows:¹

$$ Y_e = \bar{Y} - b_{yx}\bar{X} + b_{yx}X = 7.6356 - (0.2915)(5.5385) + (0.2915)X = 6.0211 + 0.2915X $$

The equation for the regression of X on Y is calculated in a similar manner.

$$ X_e = \bar{X} - b_{xy}\bar{Y} + b_{xy}Y = 5.5385 - (0.2246)(7.6356) + (0.2246)Y = 3.8235 + 0.2246Y $$

(c) Significance of Regression Coefficients

The "t" test for significance of the regression coefficient, $b_{yx}$ = 0.2915
(coded basis), can be determined as follows:

$$ S(Y - Y_e)^2 = S(Y - \bar{Y})^2 - b_{yx}^2\, S(X - \bar{X})^2 $$

$$ = SY^2 - N\bar{Y}^2 - b_{yx}^2\,(SX^2 - N\bar{X}^2) $$

$$ = 31{,}108 - (494)(7.6356)^2 - (0.2915)^2\,[16{,}930 - (494)(5.5385)^2] $$

$$ = 31{,}108 - 28{,}801.3856 - 0.0850\,(16{,}930 - 15{,}153.4500) $$

$$ = 2{,}306.6144 - (0.0850)(1{,}776.55) = 2155.6076 $$

$$ s_e = \sqrt{\frac{S(Y - \bar{Y})^2 - b_{yx}^2\, S(X - \bar{X})^2}{N - 2}} = \sqrt{\frac{2155.6076}{492}} = 2.0932 $$

$$ t = \frac{b_{yx}\sqrt{S(X - \bar{X})^2}}{s_e} = \frac{0.2915\sqrt{1776.55}}{2.0932} = 5.87 $$

This indicates that the regression coefficient is highly significant. The coefficient,
$b_{xy}$ = 0.2246, can be tested in a similar manner.
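The steps of the test may be gathered into one routine. This is a sketch under the same assumptions as the hand computation (coded totals from Table 2, N - 2 degrees of freedom); the name `t_for_regression` is hypothetical, and small differences from the printed figures arise because the means are not rounded here.

```python
import math

def t_for_regression(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Return (b_yx, s_e, t) for the regression of Y on X from raw totals."""
    sxx = sum_x2 - sum_x ** 2 / n            # S(X - Xbar)^2
    syy = sum_y2 - sum_y ** 2 / n            # S(Y - Ybar)^2
    sxy = sum_xy - sum_x * sum_y / n         # S(X - Xbar)(Y - Ybar)
    b_yx = sxy / sxx
    residual_ss = syy - b_yx ** 2 * sxx      # S(Y - Ye)^2
    s_e = math.sqrt(residual_ss / (n - 2))   # standard error of regression
    t = b_yx * math.sqrt(sxx) / s_e
    return b_yx, s_e, t

slope, s_e, t_value = t_for_regression(494, 2736, 3772, 21409, 16930, 31108)
print(round(slope, 4), round(s_e, 4), round(t_value, 2))
```

Interchanging the roles of the X and Y totals in the call gives the corresponding test for the other regression coefficient.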


¹The coded values are changed into actual values by the conversion of Y to y, X to x,
and the multiplication of $b_{YX}$ by $C_y/C_x$ as follows:

$$ \bar{y} = (\bar{Y} - 1)C_y + y_1 = (7.6356 - 1)(5) + 62 = 95.1780 $$

$$ \bar{x} = (\bar{X} - 1)C_x + x_1 = (5.5385 - 1)(20) + 9.5 = 100.2700 $$

$$ b_{yx} = (0.2915)(5/20) = 0.0729 $$

$$ y_e = \bar{y} - b_{yx}\bar{x} + b_{yx}x = 95.1780 - (0.0729)(100.27) + (0.0729)x = 87.8685 + 0.0729x $$

Questions for Discussion

1.  Define  correlation. 

2.  What  is  a  scatter  diagram?  How  is  it  influenced  by  high  correlation?  Low 
correlation? 

3. When are two variables said to be correlated? Not correlated?

4. What is the generally accepted method for the measurement of correlation? Its
limitations?

5. What is meant by r = +1, r = -1, and r = 0?

6. How can the standard error of the difference be reduced by the use of correlation?

7. Why is the "t" test preferable to the standard error test for testing the
significance of r?

8. What precautions must be exercised in the interpretation of the correlation
coefficient? Why?

9. Under what conditions is "r" a valid measure of paired relationships?

10.  What  is  regression?  Its  use? 

11.  Explain  what  is  meant  by  the  regression  of  y  on  x.  Regression  of  x  on  y. 


Problems

1. These data were collected to study the relationship between the soil moisture
content and the yield of wheat (Data from Salmon):

Moisture(x)  Yield(y)  Moisture(x)  Yield(y)  Moisture(x)  Yield(y)  Moisture(x)  Yield(y)

21        1  25        3d        18  10  le  3 

17         1  23       •  2k                  21  28  15  k 

17  3  26       39      21  28  18  8 

18  3  18  0  22  25  17  12 
21  21  18  .0  23  29  17  13 
20        2k  18          0        17  0  17  16 

19  20  18  0  16  7  16  15 
19  7  18  3  15  o  19  11 
17  19  19  7->  19  9  19  10 
16  21  15  11  18  23  18  k 
16  21  15  9  18  23  21  k 
16  20  13  9  18  27  -  -13  36 
2^  32  15  9  18  23  26  k'J 
2k                  37  18        13        jl°  3 


(a)  Calculate  the  coefficient  of  correlation  (r)  by  the  machine  method  for  ungrouped 
data,   (b)  Test  the  significance  of  "r"  by  the  "t"  test. 
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2.  The  average  length  of  culms  and  the  average  diameter  of  culms  was  measured  on  k$6 
wheat  plants  at  the  Colorado  Experiment  Station.  The  data  follows: 


Av. 

Diameter 
culms 
(mm. ) 


Avera 

ge  length  of 

culms  (cm.) 

:  x 

60 

65 

70 

75 

80 

85 

90 

95 

100 

105 

110 

115 

y 

6k 

69 

7^ 

79 

8k 

89 

9k 

99 

10U 

109 

llU 

119 

91 

-100 

1 

1 

101 

-110 

1 

2 

111- 

-120 

1 

1 

1 

121- 

-130 

1 

k 

2 

1 

2 

0 

1 

131- 

-1^0 

1 

6 

3 

3 

2 

114-1- 

-150 

2 

3 

1 

6 

lv 

2 

1 

151- 

-160 

2 

5 

5 

16 

27 

8 

7 

1 

161. 

-170 

1 

3 

8 

13 

23 

25 

26 

6 

2 

1 

171- 

-180 

3 

1 

9 

10 

15 

26 

13 

8 

8 

181  • 

-190 

1 

9 

10 

10 

23 

32 

1+ 

3 

191- 

-200 

2 

2 

8 

13 

20 

12 

1 

201- 

-210 

2 

2 

3 

5 

k 

1 

211- 

-220 

2 

Calculate  r  and  test  it  for  significance  with  "t"  test. 

3. The correlation between the reaction to Helminthosporium in F3 and F5 barley lines
was studied at the Minnesota Experiment Station. The reactions are given in
percentage infection for 1921 and 1922. The data follow:


Percentage  in  1921 


Percentage 

in 

1922 


12 

15 
13 
21 
2k 
27 


9 

12 

15 

18 

21 

2 

2 

2 

1 

1 

3 

2 

3 

1 

1 

2 

3 

2 

1 

1 

1 

k 

2 

1 

1 

Calculate the coefficient of correlation and the regression lines. Plot the
regression lines.


4. The 9-year average yields of wheat for the period 1921-29 were as follows when
irrigated at the germination and filling stages. Five plots were averaged each
year. The data follow in grams per plot:
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Year 

Grown 

Genuine 

-it  Ion 

(oS) 

192.1  ' 

511  ± 

*7 

1922 

655  * 

21 

1925 

91A  t 

52 

1924 

952  t 

23 

1925 

1+70  ± 

16 

1926 

557  - 

29 

1927 

783  ± 

20 

1928 

.- r ■"     Jr 

066  - 

20 

1929 

756  ± 

65 

Filling 
(gra.)(c£) 


•321 

23 

518 

~ 

17 

733 

i 

26 

125 

4. 

3k 

538 

± 

18 

601 

-f- 

31 

812 

i 

511 

•f 

18 

733 

z 

63 

Means         -  685  ±  11  6'57  *  11 

Determine whether or not the wheat irrigated at germination differs significantly
in yield from that irrigated at jointing and tillering. Calculate $\sigma_d$ of an average
of a difference by the formula $\sigma_d = \sqrt{a^2 + b^2}$, and by the extended formula for use
of r.

Compare $d/\sigma_d$ for both formulas.


CHAPTER  X 

THE  ANALYSIS  OF  VARIANCE 

I.  Generalized  Standard  Error  Methods 

The basis and purpose of all statistical methods is to analyze and measure
variability. Variation between observations may be due to one or more recognizable causative
factors. In addition, in all statistical work, there occur variations between
observations that result from the coalition of a large aggregate of chance factors which
defy control. This latter type of variation between observations results in various
types of statistical distributions when it is attempted to describe homogeneous
populations.

Suppose one considers a population wherein the variability may be due to both the
combinations of innumerable chance factors and the non-homogeneity of the population.
In other words, the population naturally and logically submits to sub-division into
several homogeneous groups or sub-populations. Such a situation is common in variety
tests in field experimentation. Generalized standard error methods have been devised
for data of this kind.

The purpose of the generalized standard error methods is to compute the standard error
of an entire experiment in order to increase the accuracy of the estimate of error.
In a variety test where each variety or treatment is replicated, say four times, the
reliability of the results would be very low were one to compute the standard error
for each variety separately. However, the estimate of error would be much more
reliable when computed on 10 different varieties, each replicated say four times, in
the same experiment. In this case a total of 40 plots would contribute to the
estimate of error instead of four.

The analysis of variance, developed by R. A. Fisher, has proved to be the most precise,
flexible, and readily usable method available for the analysis of the results from
field and many other biological experiments. It consists essentially in the partition
and apportionment of the total variation to its known causes with a residual portion
ascribable to unknown or uncontrolled variation and therefore called experimental
error. When the variability is measured in suitable terms, i.e., sums of squares of
deviations about the means, the variability ascribed to the various causes will be
strictly additive. The calculations are therefore extremely simple. The mean value
of the sums of squares (mean square, variance, or standard error squared) is found by
division of the sums of squares by the appropriate number of degrees of freedom.

The literature on the analysis of variance has become very extensive during the past
15 years. It was first set forth in its complete form by Fisher and MacKenzie in
1923. For its application to field experiments, Fisher (1934), and Fisher and
Wishart (1930) have given excellent discussions. Among other sources of information on
the application of the analysis of variance to field experiments may be mentioned the
books by Snedecor (1934 and 1937), and Tippett (1937), and papers by Eden and Fisher
(1929), Goulden (1931), Immer, et al. (1934), and Wishart (1931). For a summary of
the mathematical theorems involved in the analysis of variance, the work of Irwin
(1931) is recommended.

A  —  One  Criterion  of  Classification 
II. Theory of First Special Case

Suppose a sample is formed from the general population with random samples of equal
size taken from each of the sub-populations. In case m sub-samples contain n
measurements each, the total sample will contain N = nm measurements. It is now proposed to
analyze the total variance, i.e.,

$$ s^2 = \frac{S(x - \bar{x})^2}{N - 1} \tag{1} $$

where $\bar{x}$ is the mean of the total sample. Let x represent an individual measure of
any (i-th) sub-sample. Then,

$$ x - \bar{x} = (x - \bar{x}_i) + (\bar{x}_i - \bar{x}) \tag{2} $$

where $\bar{x}_i$ is the mean of the i-th sample.

First, the above identity should be squared and summed for all the n individuals which
form the i-th sub-sample. The symbol S' will be used for this summation.

$$ S'(x - \bar{x})^2 = S'(x - \bar{x}_i)^2 + 2(\bar{x}_i - \bar{x})\,S'(x - \bar{x}_i) + n(\bar{x}_i - \bar{x})^2 \tag{3} $$

Since $S'(x - \bar{x}_i) = 0$, it is evident that the second term on the right vanishes.
Now suppose one sums over all the m different sub-groups by use of the symbol $S_1^m$.
The combination $S_1^m S'$ is simply S, or summation for all individuals of the total
sample.

$$ S(x - \bar{x})^2 = S_1^m S'(x - \bar{x})^2 = S_1^m S'(x - \bar{x}_i)^2 + n\,S_1^m(\bar{x}_i - \bar{x})^2 \tag{4} $$

The term on the left is the sum of the squares of the deviations of the individual
observations from the mean of the total sample. The first term on the right is the
sum of the squares of the deviations of the individual observations from the means of
the sub-samples. The second term on the right is n times the sum of squares of the
deviations of the means of the sub-samples from the mean of the total sample.

The computation of these three terms is most easily accomplished as follows:

(1) Compute the term on the left,

$$ S(x - \bar{x})^2 = Sx^2 - \frac{(Sx)^2}{N} \tag{5} $$

(2) Next, the second term on the right,

$$ n\,S_1^m(\bar{x}_i - \bar{x})^2 = n\,S_1^m\,\bar{x}_i^2 - \frac{(Sx)^2}{N} = \frac{S(x_a^2)}{n} - \frac{(Sx)^2}{N} \tag{6} $$

where $x_a$ is an abbreviation for S'(x), the total of the variates in a single
sub-sample.

(3) The other term may be found by mere subtraction.
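The three steps above can be sketched in a few lines. The plot yields used here are hypothetical (m = 3 sub-samples of n = 4 values each) and serve only to show that the sum of squares within sub-samples found by subtraction agrees with the one computed directly from the sub-sample means; the function name is likewise an assumption.

```python
def anova_sums_of_squares(groups):
    """Total, between, and within sums of squares for m equal-sized sub-samples."""
    m = len(groups)
    n = len(groups[0])
    all_values = [x for group in groups for x in group]
    correction = sum(all_values) ** 2 / (n * m)                      # (Sx)^2 / N
    total_ss = sum(x * x for x in all_values) - correction           # step (1), equation (5)
    between_ss = sum(sum(g) ** 2 for g in groups) / n - correction   # step (2), equation (6)
    within_ss = total_ss - between_ss                                # step (3), by subtraction
    return total_ss, between_ss, within_ss

# Hypothetical yields: three sub-samples of four plots each.
plots = [[20, 22, 19, 23], [25, 27, 26, 24], [30, 29, 31, 32]]
total_ss, between_ss, within_ss = anova_sums_of_squares(plots)

# Direct computation of the within term, as a check on the subtraction.
direct_within = sum((x - sum(g) / len(g)) ** 2 for g in plots for x in g)
print(total_ss, between_ss, within_ss, direct_within)
```

Division of the between and within sums of squares by m - 1 and N - m respectively then gives the two mean squares.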

The difficult thing to explain comes at this point. It would be easy to merely
apportion, for the sample in question, the total variance into the variance within
sub-samples and into that between sub-samples. These two respective variances could
be obtained by division of the first and second terms on the right of the identity by
N. However, the real desire is to obtain the best estimate of the variance of the
population as exhibited within the sub-populations on the one hand, and between the
sub-populations on the other.

The best estimate of the total variance of the general population is given by:¹

$$ s^2 = \frac{S(x - \bar{x})^2}{N - 1} \tag{7} $$

Where N - 1 (or nm - 1) is the number of degrees of freedom.

Likewise, the best estimate of the variance within sub-populations (replicates
agronomically) will be:

$$ s_r^2 = \frac{S_1^m S'(x - \bar{x}_i)^2}{N - m} \tag{8} $$

Where N - m = m(n - 1) is the number of degrees of freedom. This is true because m
separate means of the m sub-samples were used in the computation.

¹"Is estimated by" in this sense.

This expression, $s_r^2$ = the variance within sub-samples, is often called the residual
variance.

The last term, $n\,S_1^m(\bar{x}_i - \bar{x})^2$ (Equation No. 4 above), must now be considered.

At first glance, it would seem that the sum of squares of the deviations of the means
of the sub-samples from the mean of the total sample when divided by m - 1, the number
of degrees of freedom, would give a proper estimate of the variance between
sub-populations. However, $\bar{x}_i$ does not represent the mean of the i-th sub-population, but
rather the mean of the i-th sub-sample. Therefore, the difference, $\bar{x}_i - \bar{x}$, is due to
the combination of (1) the inherent nature of the i-th sub-population and (2) sampling
fluctuations within the i-th sub-sample. Thus, $n\,S(\bar{x}_i - \bar{x})^2/(m - 1)$ can be interpreted to
estimate the sum of $n s_t^2$, n times the variance of the means of the sub-populations,
and $s_r^2$, the variance within sub-populations.

It should be remarked that the expression,

$$ s_r^2 = \frac{S_1^m S'(x - \bar{x}_i)^2}{N - m}, $$

called the "variance within sub-samples", is simply one estimate of the variance of the
total population.

The term,

$$ \frac{n\,S_1^m(\bar{x}_i - \bar{x})^2}{m - 1} = n s_t^2 + s_r^2, $$

called the "variance between sub-samples", is n times the variance of the sub-sample
(treatment) means about the total sample mean with $s_r^2$ added. On the assumption (null
hypothesis) that the true variance of the sub-population means about the total
population mean is zero, it then becomes clear that "variance within sub-samples" and
"variance between sub-samples" are both independent estimates of the same concept,
i.e., variance of the total population.

The material may be placed in tabular form for clarity:

Source of              Sums of                            Degrees of   Estimated Mean
Variation              Squares                            Freedom      Variances

Between sub-samples    $n\,S_1^m(\bar{x}_i - \bar{x})^2$             m - 1        $n\sigma_t^2 + \sigma_r^2 \doteq n s_t^2 + s_r^2$
Within sub-samples     $S_1^m S'(x - \bar{x}_i)^2$               N - m        $\sigma_r^2 \doteq s_r^2$
Total                  $S(x - \bar{x})^2$                      N - 1        $\sigma^2 \doteq s^2$

The first two entries in the last column may be examined, i.e., the estimates given
by the sample. These are $n s_t^2 + s_r^2$ and $s_r^2$. It is obvious that the first should exceed the
second unless $s_t^2$ is zero.²

It is now desired to determine whether or not there is a significant variation between the sub-populations, i.e., whether $\sigma_t^2$ is significantly different from zero. An estimate of $\sigma_t^2$ may be made by subtraction of the estimate of $\sigma_r^2$ from that of $n\sigma_t^2 + \sigma_r^2$. This result is then divided by n.
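The subtraction just described can be written out as a single expression. This is only a restatement in symbols; the names $V_t^2$ and $V_r^2$ for the between and within mean variances are assumed here from the tabular arrangement, and the hat marks an estimate.

```latex
\hat{\sigma}_t^2
  = \frac{\left(n s_t^2 + s_r^2\right) - s_r^2}{n}
  = \frac{V_t^2 - V_r^2}{n}
```

When sampling fluctuations make $V_t^2$ smaller than $V_r^2$, this estimate is negative, which is the situation the footnote on this page refers to.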


2/ Occasionally the reverse is true. This apparent contradiction is explained by the fact that the results are merely estimates, which may be distorted to whatever extent sampling fluctuations may account for.

So far it is obvious that the first step in the analysis of variance serves two purposes: (1) It gives a method to test the homogeneity of a population; (2) It gives a convenient method to test the differences between several means as a whole.

Probably the best method to test for association between sub-populations is through the use of the "z" index devised by Fisher (1934). This affords a test of significance between two variances, e.g., $s_1^2$ and $s_2^2$.

$$ z = \tfrac{1}{2}\log_e s_1^2 - \tfrac{1}{2}\log_e s_2^2 = \tfrac{1}{2}\log_e \frac{s_1^2}{s_2^2} \tag{9} $$

The "z" table devised by Fisher (1934) may be used to test these values for significance through use of the number of degrees of freedom pertinent to each computed variance. In this case, $n\sigma_t^2 + \sigma_r^2$ takes the place of $s_1^2$, while $\sigma_r^2$ takes the place of $s_2^2$.

Tests of significance may also be made by means of the "F" test derived by Mahalanobis (1932) and by Snedecor (1934) (1937). The table by Snedecor is the more extensive. The value "F" is the quotient obtained by division of the larger by the smaller variance. The "F" and "z" tests are equivalent since $z = \tfrac{1}{2}\log_e F$.

III.  Computation  for  Single  Criterion  of  Classification 

This  case  may  be  illustrated  with  some  data  for  the  yields  of  two  barley  varieties, 
(See Chapter o). The yields in bushels per acre for the Glabron and Velvet varieties grown in single plots on 12 Minnesota farms were as follows (data from F. R. Immer):

Velvet  ( ro )  Total 


Farm  No. 

Glabron  (xi) 

1 

IlO, 

2 

kl 

3 

39 

11 

37 

5 

k6 

0 

52 

7 

51 

8 

37 

Q 

k? 

10 

k? 

11 

'4-6 

12 

6k 

Totals  (Sx) 

-3o 

Means  (x) 

■'+3. 

3333 

S  (X2)  r,   p0] 

399 . 00 

kl 

-7r*- 
30 

32 

kl 

Vi 

)+o 

36 

k2 

39 

k'( 

30 

309 
V2A167 

(§x)2  - 

k9. 

k 13 .38 

N 

91 

0i|. 
i  / 

60 

87 

93 
Q6 
113 
37 
Sk 

93 
10:; 
Suppose that the total variability is separated into two components, viz., that "due to varieties" and that due to variation between plots of the same variety. The expression "due to varieties" simply means that it is proposed to make varieties the criterion for the break-down of the total sample into sub-samples.

The sum of squares for total variation is found by summation of the squares of the 24 plot yields, e.g. $(49)^2 + (47)^2 + \dots + (39)^2 = 50,599$, and the subtraction of the correction factor $(Sx)^2/N$ from this value.

Algebraically, this is given by:

$$ \text{Total} = S(x - \bar{x})^2 = S(x^2) - (Sx)^2/N = 50,599.00 - 49,413.38 = 1185.62 $$

The sum of squares for varieties is obtained by the summation of the squares for the two totals for varieties, dividing by the number of plots or values contained in each variety total, and subtracting the correction factor, e.g.

$$ \text{Between Varieties} = n\,S_1^m(\bar{x}_v - \bar{x})^2 = \frac{S(x_v^2)}{n} - \frac{(Sx)^2}{N} = \frac{595,481.00}{12} - 49,413.38 = 49,623.42 - 49,413.38 = 210.04 $$

The sum of squares for within varieties, here used as error, is the remainder after subtraction of the sum of squares for varieties from the total, e.g. 1185.62 - 210.04 = 975.58.
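The three sums of squares above can be checked with a few lines of Python. This is a minimal sketch that starts from the summary figures quoted in the text (50,599.00; 595,481.00; 49,413.38) rather than from the individual plot yields; the function name and argument names are illustrative assumptions, not part of the original method.

```python
def one_way_sums_of_squares(raw_ss, correction, totals_sq, n_per_group):
    """Split the total sum of squares into between-group and within-group parts.

    raw_ss      : S(x^2), the sum of the squared observations
    correction  : (Sx)^2 / N, the correction factor
    totals_sq   : sum of the squared group totals
    n_per_group : number of observations in each group (equal groups assumed)
    """
    total = raw_ss - correction
    between = totals_sq / n_per_group - correction
    within = total - between  # found by subtraction, as in the text
    return total, between, within


# Summary figures quoted in the text for the two barley varieties on 12 farms.
total_ss, between_ss, within_ss = one_way_sums_of_squares(
    raw_ss=50_599.00, correction=49_413.38, totals_sq=595_481.00, n_per_group=12
)
print(f"total   = {total_ss:.2f}")
print(f"between = {between_ss:.2f}")
print(f"within  = {within_ss:.2f}")
```

Small differences in the last decimal place can arise because the text rounds each intermediate value to two places before subtracting.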

The analysis of variance follows:

Variation due to      D.F.    Sum Squares    Mean Square    Standard Error (s)    z-value obtained    z, 5 pct. point    F

Between Varieties       1       210.04        210.0400                                0.7777              0.7294           4.737
Within Varieties       22       975.58         44.3446           6.6592

Total                  23      1185.62

The degrees of freedom for "between varieties" and "total" are one less than the number of varieties and total number of plots, respectively. The degrees of freedom for within varieties are those for a single variety (11 in this case) multiplied by the number of varieties (2 in this case), or (2)(11) = 22.

The  mean  squares  are  obtained  by  division  of  the  sums  of  squares  by  their  respective 
degrees  of  freedom.  The  standard  error  of  a  single  determination  (s)  is  the  square 
root  of  the  mean  square  for  error  (or  variance) . 

The z-test may be used to test the significance of the variance "between varieties" against that "within varieties". The value, z, is one-half the difference between the natural logarithms of the variances to be compared. The values of the logarithms needed in computing z are found in a table of natural logarithms (See Table 4, appendix).

$$ z = \tfrac{1}{2}\log_e 210.04 - \tfrac{1}{2}\log_e 44.3446 = \tfrac{1}{2}\log_e\!\left(\frac{210.0400}{44.3446}\right) = \tfrac{1}{2}\log_e 4.737 = 0.7777 $$


1/ $S(x - \bar{x})^2 = Sx^2 - 2\bar{x}\,S(x) + N\bar{x}^2$

$= Sx^2 - 2\bar{x}\cdot N\bar{x} + N\bar{x}^2$
$= Sx^2 - N\bar{x}^2$

Since $N\bar{x} = S(x)$,

$Sx^2 - N\bar{x}^2 = Sx^2 - S(x)\,\bar{x} = Sx^2 - (Sx)^2/N$

The decimal point may be moved to the left on the mean square values to shorten the work, so long as the resultant numbers are greater than 1.0 and both values are shifted by the same number of places. The true $\log_e$ values will not be obtained but the difference or z-value is unaffected. A shift of decimals is particularly desirable when any of the mean squares are less than 1.0, to avoid taking a negative logarithm.

The theoretical z-value is looked up in the table given by Fisher (1934), where $N_1$ is the number of degrees of freedom for the larger and $N_2$ the degrees of freedom for the smaller variance. In this case z = 0.7294 for the theoretical value for the 5 per cent point. However, the interpretation is made more easily by using the "F" value. The "F" value is the quotient of the larger by the smaller variance, e.g., F = 210.04/44.3446 = 4.74. In Snedecor's table (Table 2, Appendix) for $N_1$ = 1 D.F. and $N_2$ = 22 D.F., it is found that the observed "F" lies between the 5.0 per cent and 1.0 per cent points. The theoretical value for the 5.0 per cent point is 4.30. It may be noted that $F = t^2$ for one degree of freedom.
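The relation between "F" and "z" can be sketched in Python as follows. The two mean squares and the 5 per cent points (4.30 for F, 0.7294 for z) are the values quoted in the text; the function and variable names are assumptions made for illustration, and the tabulated points are typed in by hand rather than computed.

```python
import math


def variance_ratio(larger, smaller):
    """Return (F, z) for two variances, with z = 1/2 log_e F."""
    f_value = larger / smaller
    z_value = 0.5 * math.log(f_value)
    return f_value, z_value


ms_between = 210.04   # mean square between varieties, 1 D.F.
ms_within = 44.3446   # mean square within varieties, 22 D.F.

f_value, z_value = variance_ratio(ms_between, ms_within)

# 5 per cent points quoted in the text for N1 = 1 and N2 = 22 D.F.
F_5PCT = 4.30
Z_5PCT = 0.7294
significant_by_f = f_value > F_5PCT
significant_by_z = z_value > Z_5PCT

# Moving the decimal point one place to the left on both mean squares
# leaves z unchanged, because only the ratio enters the logarithm.
_, z_shifted = variance_ratio(ms_between / 10, ms_within / 10)

print(f"F = {f_value:.3f}, z = {z_value:.4f}")
print("significant at 5 per cent:", significant_by_f, significant_by_z)
```

Both tests necessarily lead to the same decision, since z is a monotone function of F.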

IV.  The  More  General  Case 

Suppose that the number of observations in each sub-sample varies, and that they are represented by $n_1, n_2, \dots, n_m$. Then $N = S\,n_i$. The equation for the sample is as follows:

$$ S(x - \bar{x})^2 = S_1^m S'(x - \bar{x})^2 = S_1^m S'(x - \bar{x}_i)^2 + S_1^m n_i(\bar{x}_i - \bar{x})^2 \tag{10} $$

Again, $S_1^m S'(x - \bar{x}_i)^2$ divided by N - m, the degrees of freedom, will give the variance within a group. However, it is now impossible to arrive at an estimate of $s_t^2$ because $S_1^m n_i(\bar{x}_i - \bar{x})^2$ is affected by the different number of observations in each sub-sample. Therefore suppose that $\sigma_t^2$ is truly zero so that $S_1^m n_i(\bar{x}_i - \bar{x})^2/(m - 1)$ will estimate $\sigma_r^2$.

This assumption may be tested for the existence of an association between sub-populations. To do this, the values $S_1^m S'(x - \bar{x}_i)^2/(N - m)$ and $S_1^m n_i(\bar{x}_i - \bar{x})^2/(m - 1)$ are compared for a significant difference.

In the field of agronomic experimentation this situation is rarely found because the experiments are designed to permit a simpler set-up for the computation of the statistical constants.
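Equation (10) and the two mean squares compared in this section can be illustrated with a short Python sketch. The three sub-samples below are made-up numbers chosen only so that the group sizes differ; the function name and the dictionary keys are likewise assumptions for illustration.

```python
def one_way_anova(groups):
    """One-way analysis of variance for sub-samples of unequal size.

    groups is a list of lists of observations. Returns the sums of squares,
    the two mean squares, and their ratio F.
    """
    values = [x for g in groups for x in g]
    n_total = len(values)            # N = S n_i
    m = len(groups)                  # number of sub-samples
    grand_mean = sum(values) / n_total

    between_ss = 0.0
    within_ss = 0.0
    for g in groups:
        mean_i = sum(g) / len(g)
        between_ss += len(g) * (mean_i - grand_mean) ** 2   # n_i (x_i - x)^2
        within_ss += sum((x - mean_i) ** 2 for x in g)      # S'(x - x_i)^2
    total_ss = sum((x - grand_mean) ** 2 for x in values)

    ms_between = between_ss / (m - 1)
    ms_within = within_ss / (n_total - m)
    return {
        "total_ss": total_ss,
        "between_ss": between_ss,
        "within_ss": within_ss,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical sub-samples with 3, 2 and 4 observations.
result = one_way_anova([[10, 12, 14], [20, 22], [15, 16, 17, 18]])
print(result)
```

For these made-up figures the between and within sums of squares add to the total, as equation (10) requires.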

B  --  Two  or  More  Criteria  of  Classification 

V. Theory of the Extended Case of the Analysis of Variance.

Frequently, the complexity of the experiment that affords the data makes it necessary to analyze the total variance into more than two parts in order to make the most of the possibilities. First, re-examine the tabular arrangement for the first special case (Paragraph II). Suppose the classification of the total population into sub-populations, which forms the basis of the above analysis, be termed classification "A". Now suppose the total population lends itself to an independent classification, "B", which contains "m'" classes. For simplicity, assume that the sample sub-divides evenly for this classification with "n'" observations in each class. Thus, N = nm = n'm'.

Previously, the heterogeneity in the total population for classification "A" was tested by a comparison of $V_t^2$ with $V_r^2$. It was necessary to tacitly assume that each sub-population for classification "A" was homogeneous. Now, if the population submits to a new classification "B", it is quite likely that the original sub-populations were not homogeneous if classification "B" has any logical basis. Lack of homogeneity in the sub-populations increases the variance therein. The residual variance, $V_R^2$, may be so affected in the comparison between $V_A^2$ and $V_R^2$ that the differences between groups for classification "A" may appear to be insignificant when the opposite is true.

Therefore, it is proposed to remove from the squared residual errors, $S_1^m S'(x - \bar{x}_i)^2$, the sum of the squared errors between groups for classification "B". This amount will be termed $n'\,S_{j=1}^{m'}(\bar{x}_j - \bar{x})^2$. It represents m' - 1 degrees of freedom, while $\bar{x}_j$ indicates the mean of the j-th class of classification "B". The mean variance between groups for classification "B" will be designated as $V_B^2$.

At first, one might expect the mean residual variance ($V_R^2$) to be definitely reduced in this manner, regardless of any justification for classification "B". This is not true because the reduced sum of the residual squared errors now represents only N - m - m' + 1 degrees of freedom where N - m degrees of freedom were represented before. Thus, it is apparent that the new $V_R^2$ will not differ sensibly from its former value, should the differences between groups be insignificant for classification "B". However, the greater the significance of the differences between the groups for classification "B", the more markedly $V_R^2$ will be reduced. Then the ratio $V_A^2/V_R^2$ will be sensibly increased, with the result that the test for significance of differences between groups for classification "A" is strengthened. The new tabular arrangement of the analysis is as follows:


Source of Variation     Sum of Squares                           Degrees of Freedom    Mean Variance

Between groups (A)      $n\,S_1^m(\bar{x}_i - \bar{x})^2$        m - 1                 $V_A^2$
Between groups (B)      $n'\,S_1^{m'}(\bar{x}_j - \bar{x})^2$    m' - 1                $V_B^2$
Residual                                                         N - m - m' + 1        $V_R^2$ (or $s^2$)

Total                   $S(x - \bar{x})^2$                       N - 1


The  entry  for  the  sum  of  squares  of  the  residual  errors  is  left  blank  because,  in 
computation,  it  would  be  found  by  subtraction. 

This process may be extended in the same manner to take into account other possible classifications which might contribute to the heterogeneous character of the original population. The object is for the residual variance to represent variance due to chance alone as nearly as possible. Furthermore, an increase in the scope of an experiment will proportionately increase, to within differences due to sampling fluctuations, all the sums of squared deviations incorporated into the analysis. Hence, $V_A^2$ and $V_B^2$ will be increased proportionately since the number of degrees of freedom they respectively represent are unchanged. The value $V_R^2$ will be increased to a lesser extent due to the fact that the number of degrees of freedom represented will be more than proportionately increased. Thus, $V_A^2/V_R^2$ and $V_B^2/V_R^2$ will be increased which, together with the fact that a smaller z value is required to prove significance, makes it more likely that positive conclusions can be drawn from the analysis of variance.

VI.  Computation  for  Two  Criteria  of  Classification 


The same data on the yields of Glabron and Velvet barley varieties are used to illustrate this case. It is desired to determine whether or not there is a significant variation from farm to farm as well as between varieties. Hence, the computations will be for total variance, that due to farms, and that due to varieties. The residual variance will be obtained by subtraction.

$$ S(x - \bar{x})^2 = S(x^2) - (Sx)^2/N = 50,599.00 - 49,413.38 = 1185.62 $$

$$ n\,S_1^m(\bar{x}_v - \bar{x})^2 = \frac{S(x_v^2)}{n} - \frac{(Sx)^2}{N} = \frac{595,481.00}{12} - 49,413.38 = 210.04 $$

$$ n'\,S_1^{m'}(\bar{x}_f - \bar{x})^2 = \frac{S(x_f^2)}{n'} - \frac{(Sx)^2}{N} = \frac{100,269.00}{2} - 49,413.38 = 721.12 $$

The  subscripts,  v  and  f  evidently  indicate  "varieties"  and  "farms". 
The  new  tabular  arrangement  now  becomes: 


Variation due to    D.F.    Sums Squares    Mean Square    Standard Error (s)    F-value

Farms                11        721.12
Varieties             1        210.04        210.0400                              9.08
Error                11        254.46         23.1327          4.8096

Total                23       1185.62


When the F-table is consulted it is found that an F-value of 4.84 is required for the 5 per cent point. Thus, the added refinement through the removal of the variation between farms greatly increased the significance of the difference due to varieties.
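The gain from removing the farm variation can be reproduced with a short Python sketch. It works from the summary figures quoted in the text (sums of squared totals and the correction factor), not from the individual plot yields; the variable names are assumptions for illustration.

```python
raw_ss = 50_599.00          # S(x^2) for the 24 plot yields
correction = 49_413.38      # (Sx)^2 / N
variety_totals_sq = 595_481.00   # sum of squared variety totals, 12 plots each
farm_totals_sq = 100_269.00      # sum of squared farm totals, 2 plots each

total_ss = raw_ss - correction
variety_ss = variety_totals_sq / 12 - correction
farm_ss = farm_totals_sq / 2 - correction
error_ss = total_ss - variety_ss - farm_ss   # residual, by subtraction

df_total, df_variety, df_farm = 23, 1, 11
df_error = df_total - df_variety - df_farm   # N - m - m' + 1 = 11

error_ms = error_ss / df_error
f_two_way = (variety_ss / df_variety) / error_ms

# For comparison: the single-criterion analysis leaves farms in the error.
one_way_error_ms = (total_ss - variety_ss) / (df_total - df_variety)
f_one_way = (variety_ss / df_variety) / one_way_error_ms

print(f"error S.S. = {error_ss:.2f}, error M.S. = {error_ms:.4f}")
print(f"F for varieties: one criterion {f_one_way:.2f}, two criteria {f_two_way:.2f}")
```

The variety sum of squares is the same in both analyses; only the error mean square, and therefore F, changes.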

VII. Introduction to Analysis of Variance in Agricultural Experiments

The principal difficulty to contend with in field experiments is the variation in soil fertility over the area used in experimentation. The natural fertility usually varies continuously. The art of planning an experiment lies in the arrangement of the varieties, treatments or conditions under investigation in nearby plots. They are usually placed within as small a land area as is practically feasible. The entire arrangement is then replicated over a larger area so that the variations caused by regional changes in fertility may be removed from the comparison. The randomized block arrangement, and its more restricted form, the latin square arrangement, are commonly used to make possible the removal of the general effect of soil heterogeneity by means of the analysis of variance (See Chapter on "Design of Simple Field Experiments").

In the use of the analysis of variance in field experiments it is assumed that the distribution of the plot yields is normal, i.e., that it fits the normal curve. The agronomist is familiar with the fact that the variability between plots of the same variety grown on land of high fertility is often less than between similar plots of low fertility. The variability among plots of high fertility may be considered as restricted by what may be termed "ceiling effect", which imparts an abnormal distribution to the population. Fisher and others (1932) found evidence of negative skewness in heights of barley plants selected at random from plots that received various nitrogen treatments. Eden and Yates (1935) obtained similar results with height measurements of wheat plants. They made a practical test on these data to determine whether the validity of the z-test would be destroyed by such non-normal data. They concluded that the z-test could be safely applied.

Questions for Discussion

1.  What  is  meant  by  generalized  standard  error  methods?  Why  are  they  useful  in 
agronomic  experiments? 

2 .  What  are  the  general  features  of  the  analysis  of  variance? 

3.  What  is  the  basis  of  sub-division  of  the  sample  for  one  criterion  of  classifica- 
tion? 

4 .  Why  is  it  logical  to  use  the  variance  for  within  varieties  to  compare  with  that 
between  varieties? 

5.  What  is  the  "z"  test  for  significance?  "F"  test? 

6.  How  may  the  sub-division  of  the  total  sample  into  two  criteria  of  classification 
strengthen  the  experiment? 

7.  What  assumptions  are  made  in  the  use  of  the  analysis  of  variance  for  plot  yield 
data? 


Problems 

1. Yield data in bushels per acre for 5 wheat varieties are given below:

                 Replications
Variety       1         2         3        Total

A           32.4      34.3      37.3      104.0
B           20.2      27.5      25.9       73.6
C           29.2      27.8      30.2       87.2
D           12.8      12.3      14.8       39.9
E           21.7      24.5      23.4       69.6

Totals     116.3     126.4     131.6      374.3

(a) Calculate the analysis of variance for one criterion of classification, i.e., between and within varieties.

(b) Obtain the "F" value and determine whether or not the varieties differ significantly in yield.

(c) Use the "z" test to determine significance.

2. Calculate the data in problem 1 for 2 criteria of classification, i.e., replicates and varieties. Determine whether or not the varieties differ significantly in yield by use of the "F" test.


CHAPTER  XI 
COVARIANCE  WITH  SPECIAL  REFERENCE  TO  REGRESSION 

I.  Relationship  of  Covariance,  Correlation,  and  Regression 

The concepts of covariance, correlation, and regression are interwoven, being fundamentally equivalent. Suppose one considers N pairs of measures that relate to two characters represented by the variables x and y. In the chapter on correlation, it was seen that the basis for the measurement of correlation and regression was the product sum, $S(x - \bar{x})(y - \bar{y})$. The entire subject can well be treated by the analysis of variance principle.

II.  Analysis  of  Covariance 

Suppose the sample of N pairs of measures is divided into m sub-samples that contain n pairs of variates each. Let $\bar{x}_i$ and $\bar{y}_i$ represent the pair of means that correspond to the i-th sub-sample. Then, for any pair of variates in the i-th sub-sample this equation can be formed:

$$ (x - \bar{x})(y - \bar{y}) = [(x - \bar{x}_i) + (\bar{x}_i - \bar{x})]\,[(y - \bar{y}_i) + (\bar{y}_i - \bar{y})] \tag{1} $$

By a procedure analogous to the first treatment of the analysis of variance, the right side of this expression may be expanded and summed for all the pairs of variates in the i-th sub-sample, viz.,

$$ S'(x - \bar{x})(y - \bar{y}) = S'(x - \bar{x}_i)(y - \bar{y}_i) + n(\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y}). $$

It is noticed that the two middle terms of the expansion become zero for the summation, since $S'(x - \bar{x}_i)$ and $S'(y - \bar{y}_i)$ are both zero. The summation is taken again to include all the sub-samples, viz.,

$$ S_1^m S'(x - \bar{x})(y - \bar{y}) = S(x - \bar{x})(y - \bar{y}) = S_1^m S'(x - \bar{x}_i)(y - \bar{y}_i) + n\,S_1^m(\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y}) \tag{2} $$

The total covariance or correlation, $S(x - \bar{x})(y - \bar{y})$, may be most easily computed by this formula:

$$ S(x - \bar{x})(y - \bar{y}) = Sxy - \frac{(Sx)(Sy)}{N} \tag{3} $$

The term, $n\,S_1^m(\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y})$, which measures the covariance between sub-sample means, can be computed as follows:

$$ n\,S_1^m(\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y}) = n\,S_1^m \bar{x}_i\bar{y}_i - \frac{(Sx)(Sy)}{N} = \frac{S_1^m x_a y_a}{n} - \frac{(Sx)(Sy)}{N} \tag{4} $$

where $x_a$ and $y_a$ are abbreviations for $S'x$ and $S'y$, the sums of the variates in a single sub-sample.

The term, $S_1^m S'(x - \bar{x}_i)(y - \bar{y}_i)$, which measures the covariance within the sub-samples, can be found by subtraction.
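The partition of the sum of products can be sketched in Python. The function below applies the total and between-sub-sample formulas and obtains the within term by subtraction; the small data set is made up purely for illustration, and the function and variable names are assumptions.

```python
def covariance_partition(groups):
    """Partition the sum of products S(x - xbar)(y - ybar).

    groups is a list of sub-samples, each a list of (x, y) pairs, with the
    same number n of pairs in every sub-sample. Returns (total, between, within).
    """
    pairs = [p for g in groups for p in g]
    n_total = len(pairs)
    n = len(groups[0])
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    correction = sx * sy / n_total                       # (Sx)(Sy)/N

    total = sum(x * y for x, y in pairs) - correction    # Sxy - (Sx)(Sy)/N
    between = sum(
        sum(x for x, _ in g) * sum(y for _, y in g) for g in groups
    ) / n - correction                                   # S(x_a y_a)/n - (Sx)(Sy)/N
    within = total - between                             # by subtraction
    return total, between, within


def within_direct(groups):
    """Within-sub-sample sum of products computed from deviations, as a check."""
    result = 0.0
    for g in groups:
        mx = sum(x for x, _ in g) / len(g)
        my = sum(y for _, y in g) / len(g)
        result += sum((x - mx) * (y - my) for x, y in g)
    return result


# Hypothetical data: 2 sub-samples of 3 pairs each.
sample = [[(1, 2), (2, 1), (3, 3)], [(4, 6), (5, 5), (6, 7)]]
total_sp, between_sp, within_sp = covariance_partition(sample)
print(total_sp, between_sp, within_sp, within_direct(sample))
```

The direct computation of the within term agrees with the value found by subtraction, which is what the identity for the whole sample asserts.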

The computation is analogous to the ordinary case of analysis of variance. In fact, it should be incorporated with it for each variable separately. An illustrative example will make the computation and analysis clear.

III. Computation of Covariance 1/

The data used to illustrate this problem involve height measurements of 5 plants from each of 13 inbred lines of sweet clover, together with a determination of the percentage of leaves (by weight) on each of these plants. The data are given in table 1 for height in inches (x) and per cent leaves (y) for each of the 65 plants.

Table  1.  Data  on  Height  and  Percent  Leaves  of  5  Plants  from  each  of  13  Lines  of  Sweet 
Clover 


Plant 

Numb 

er 

Toi 

,el ' 

Line 

1 

2 

3 

1+ 

5 

Height 
Sx 

Leaves 

No. 

X 

7 

x 

y 

X 

X 

y 

y~ 

5 

Sy 

(In.) 

do) 

(In.) 

(*) 

(In.) 

(i) 

(In.) 

(*) 

(In.) 

W 

1 

63 

^ 

.66 

38 

39 

ho 

62 

39 

69 

1+0 

319 

202 

2 

TO 

33 

77 

37 

6k 

39 

53 

30 

61 

1+0 

3?-5 

192. 

3 

51+ 

37 

51 

50 

56 

1+9 

61 

35 

56 

1+9 

2T8 

220 

k 

i+o 

50 

39 

kk 

lOfr 

k2 

38 

1+3 

1+5 

1+5 

206 

22)+ 

5 

30 

k9 

1+0 

30 

■U-I4- 

k2 

39 

1+3 

1+0 

kk 

199 

23O. 

6 

1+1+ 

1+8 

50 

30 

54 

kk 

5t 

1+2 

•36 

kk 

238 

228 

7 

58 

58 

60 

38 

58 

1+2 

60 

1+0 

6k 

ko 

300 

198 

8 

5*i 

14-2 

52 

i+8 

Jib 

1+0 

52 

1+8 

kb 

kl 

230 

225 

9 

52 

1+1 

36 

39 

52 

1+2 

kb 

1+3 

1+8 

1+2 

231+ 

207 

10 

38 

1*0 

kS 

1+1 

52 

39 

52 

1+0 

60 

39 

270 

199 

11 

63 

kl 

$h 

1+2 

38 

37 

r-  O 

3d 

1+0 

5'!- 

ko 

292 

200 

12 

1+8 

50 

}+5 

33 

1+3 

c-"0 

1+1 

k9 

l+l 

53 

220 

239 

13 

1+3 

1+7 

31 

in 

1+5 

-  f 

;:>o 

ll-l 

1+3 

l,k 

i+6 

20l+ 

233 

Tot, 

3375 

2819 

Since there was no replication of these lines the total variability will be divided into only two components: (1) between lines and (2) between plants within lines. Let the height of plants be designated as (x) and the per cent leaves be designated as (y).

The sum of squares for total variation in height of plants will be:

$$ S(x^2) - (Sx)^2/N = 180,831.0 - 175,240.4 = 5590.6 $$

The sum of squares for variation between lines is calculated from the sums of five plants per line as follows:

$$ \frac{S(x_a^2)}{n} - \frac{(Sx)^2}{N} = \frac{898,467}{5} - 175,240.4 = 4453.0 $$

In like manner the total sum of squares for per cent leaves will be:

$$ S(y^2) - \frac{(Sy)^2}{N} = 123,747 - 122,257.9 = 1489.1 $$

The sum of squares for the 13 lines in per cent leaves will be:

$$ \frac{S(y_a^2)}{n} - \frac{(Sy)^2}{N} = \frac{615,713}{5} - 122,257.9 = 884.7 $$

1/ This illustrative example is one prepared by Dr. F. R. Immer, with minor modifications.

The sum of products for total variation will be obtained by multiplication of each plant height by the per cent leaves on that plant. The results are then summed. This will be:

$$ S(xy) - \frac{(Sx)(Sy)}{N} = 144,697 - 146,371.2 = -1674.2 $$

The sum of products for variation between lines is obtained by a similar process, viz.,

$$ \frac{S(x_a y_a)}{n} - \frac{(Sx)(Sy)}{N} = \frac{724,014}{5} - 146,371.2 = -1568.4 $$

The analysis of variance and co-variance table can now be constructed as given in table 2.

Table 2. Analysis of variance and co-variance

                                   Sum of Squares due to:             Mean Sq. due to:
Variation due to:        D.F.      x²          xy          y²         x²          y²

Lines (Between)           12      4453.0     -1568.4      884.7      371.08**    73.72**
Within Lines (Error)      52      1137.6      -105.8      604.4       21.88      11.62

Total                     64      5590.6     -1674.2     1489.1

**Exceeds the 1 per cent point.

The sums of squares and sum of products for variation between plants within lines are obtained by subtraction.

Differences  between  lines  with  regard  to  height  of  plants  (x),  and  percentage  of 
leaves  (y),  may  now  be  tested  separately  for  significance  in  the  ordinary  manner.  It 
is  noted  that  these  lines  were  significantly  different  in  both  height  of  plants  (x) 
and  per  cent  leaves  (y),  the  mean  square  for  lines  compared  with  error  being  greater 
than  the  1  per  cent  point. 

However,  there  is  no  method  to  determine  the  significance  of  co-variance  itself  (xy). 
That  is  determined  by  tests  of  significance  performed  on   correlation  or  regression 
coefficients  calculated  from  it.  This  problem  will  be  considered  next. 

IV.  Calculation  of  Correlation  and  Regression  Coefficients 

The coefficients of correlation can be calculated directly from the sums of squares and products, since

$$ r = \frac{S(x - \bar{x})(y - \bar{y})}{\sqrt{S(x - \bar{x})^2}\,\sqrt{S(y - \bar{y})^2}} \tag{5} $$

By substitution of the sums of squares and products for variation between lines, given in table 2, one obtains:

$$ r = \frac{-1568.4}{\sqrt{4453.0}\,\sqrt{884.7}} = -.790 $$

The other correlation coefficients can be calculated in like manner from table 2, merely by substitution of the sums of squares and products found in the appropriate row in the table, for the source of variability to be considered.

The coefficient of regression of y on x, i.e., for prediction of y from x, will be given by

$$ b = \frac{S(x - \bar{x})(y - \bar{y})}{S(x - \bar{x})^2}. $$

By substitution of the sum of products and sum of squares for "lines" from table 2,

$$ b = \frac{-1568.4}{4453.0} = -.3522. $$
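The two coefficients follow directly from a row of table 2. The sketch below uses the "Lines (Between)" and "Total" rows as quoted there; the function names and the dictionary layout are assumptions made for illustration.

```python
import math


def correlation(sxx, sxy, syy):
    """r = S(x - xbar)(y - ybar) / sqrt(S(x - xbar)^2 * S(y - ybar)^2)."""
    return sxy / math.sqrt(sxx * syy)


def regression_y_on_x(sxx, sxy):
    """b = S(x - xbar)(y - ybar) / S(x - xbar)^2, for predicting y from x."""
    return sxy / sxx


# (sum of squares x, sum of products xy, sum of squares y) from table 2.
table2 = {
    "lines": (4453.0, -1568.4, 884.7),
    "total": (5590.6, -1674.2, 1489.1),
}

coefficients = {}
for source, (sxx, sxy, syy) in table2.items():
    coefficients[source] = (correlation(sxx, sxy, syy), regression_y_on_x(sxx, sxy))
    r, b = coefficients[source]
    print(f"{source:6s} r = {r:.3f}  b = {b:.4f}")
```

Because r and b share the same numerator, they always carry the same sign; here both are negative, taller plants having a lower percentage of leaves.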
The correlation and regression coefficients are given in table 3.

Table 3. Coefficients of correlation and regression

                                   Correlation between           Regression of
                                   height (x) and per            per cent leaves on
Variation due to:        D.F.      cent leaves (y)               height (y on x)

Lines (Between)           11           -.790**                       -.3522
Within Lines              51           -.114                         -.0930

Total                     63           -.580**                       -.2995

**Exceeds the 1 per cent point of Fisher's table V.A.

$\sigma_z = 1/\sqrt{N - 3}$

From Fisher's table V.A. it is seen that r = -.790 is greater than the expected value of r for P = .01. The chances are, therefore, in excess of 99:1 against the occurrence of so large a correlation coefficient through errors of random sampling from uncorrelated material. The degrees of freedom for Fisher's table V.A. are 2 less than the number of pairs in the sample and would, therefore, be one less than the degrees of freedom in table 2.

The correlation coefficient within lines, r = -.114, is not significant. The degrees of freedom are 51 in this case.

V. Tests for Significance for Regression Coefficients

The regression coefficients can be tested for significance by means of an analysis of variance or by means of a "t" test. The former method will be illustrated first.

(a) Test by Analysis of Variance

Suppose there exists a linear regression of percentage of leaves (y) on plant height (x). Then $y_e$, the estimated percentage of leaves from a sample of N pairs of values of y and x, is given by the regression equation:

$$ y_e = a + b(x - \bar{x}) \tag{6} $$

In this equation, $a = \bar{y}$ and $b = S(y - \bar{y})(x - \bar{x})/S(x - \bar{x})^2$ are estimates of the true mean percentage of leaves and the true regression coefficient, respectively.

Since the regression equation can be written as $\bar{y} = y_e - b(x - \bar{x})$, it is apparent that:

$$ S(y - \bar{y})^2 = S\{y - [y_e - b(x - \bar{x})]\}^2 = S(y - y_e)^2 + 2b\,S(y - y_e)(x - \bar{x}) + b^2 S(x - \bar{x})^2 $$

Due to the fact that the middle term on the right is zero, 1/

$$ S(y - \bar{y})^2 = S(y - y_e)^2 + b^2 S(x - \bar{x})^2 \tag{7} $$

1/ Consider $S(y - y_e)(x - \bar{x})$. Since $y_e = \bar{y} + b(x - \bar{x})$, we have: $S[y - \bar{y} - b(x - \bar{x})](x - \bar{x})$ or $S(y - \bar{y})(x - \bar{x}) - b\,S(x - \bar{x})^2$. It is clear that the whole expression is zero due to the fact that $b = S(y - \bar{y})(x - \bar{x})/S(x - \bar{x})^2$.

Thus, the sum of the squares of the deviations of the percentage of leaves has been analyzed into two components, one dependent on b, and therefore ascribable to regression, and the other a sum of squares that represents deviation from regression or residual.

Since $b = S(y - \bar{y})(x - \bar{x})/S(x - \bar{x})^2$, it is obvious that the value of $b^2 S(x - \bar{x})^2$, the component ascribable to regression, will be:

$$ b^2 S(x - \bar{x})^2 = \frac{[S(x - \bar{x})(y - \bar{y})]^2}{S(x - \bar{x})^2} \tag{8} $$

This procedure is now applied to the illustrative problem. The values from table 2 will be used to compute total regression:

$$ S(y - \bar{y})^2 = 1489.10 $$

$$ \frac{[S(x - \bar{x})(y - \bar{y})]^2}{S(x - \bar{x})^2} = \frac{(-1674.2)^2}{5590.6} = 501.37 $$

The analysis to test the significance of total regression follows in table 4.

Table 4. Analysis of Variance to Test the Significance of Total Regression

Variation                      D.F.    Sum of Squares    Mean Square    F-value

Due to Regression                1         501.37          501.37       31.98**
Deviations from Regress.        63         987.73           15.68

Total                           64        1489.10

The  total  sum  of  squares  for  y  (leaf  percentage)  is  taken  directly  from  table  2. 
Here  y  is  used  as  the  dependent  variable,  i.e.,  y  (leaf  percentage)  is  predicted  from 
x  (plant  height)  which  is  known.  The  sum  of  squares  due  to  deviations  from  regres- 
sion is  obtained  by  subtraction,  i.e.,  .U1-89.IO  -  501.37  =  987.73.  There  will  be  one 
degree  of  freedom  due  to  linear  regression  with  a  remainder  of  N-2  degrees  of  freedom 
for  deviations  from  regression.  It  is  also  to  be  noted  that  N-2  is  the  number  of 
degrees  of  freedom  used  to  test  the  significance  of  r  (Fisher,  Table  V.A).  It  is 
obvious  from  table  4  that  the  regression  coefficient  is  highly  significant,  since  the 
"F"  value  exceeds  the  one  per  cent  point.  The  same  conclusion  was  obtained  when  r 
was  tested  for  significance.  In  fact,  the  two  tests  for  significance  are  equivalent. 
When  the  correlation  coefficient  is  significant,  the  regression  coefficient  must  be 
significant,  and  vice  versa. 
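
The arithmetic behind table 4 can be checked with a few lines of Python. This is only an illustrative sketch: the function name `regression_anova` and its argument names are assumptions of this example, and the inputs are the sums quoted above for the total line (1489.10, -1674.2, 5590.6, with N = 65).

```python
def regression_anova(ss_y, sp_xy, ss_x, n):
    """Split the total sum of squares of y into regression and residual parts.

    ss_y  : S(y - ybar)^2, total sum of squares of the dependent variable
    sp_xy : S(x - xbar)(y - ybar), sum of products
    ss_x  : S(x - xbar)^2, sum of squares of the independent variable
    n     : number of paired observations
    """
    ss_regression = sp_xy ** 2 / ss_x          # equation (8), 1 degree of freedom
    ss_deviations = ss_y - ss_regression       # obtained by subtraction, n - 2 d.f.
    ms_deviations = ss_deviations / (n - 2)
    return {
        "b": sp_xy / ss_x,
        "ss_regression": ss_regression,
        "ss_deviations": ss_deviations,
        "ms_deviations": ms_deviations,
        "F": ss_regression / ms_deviations,
        "r_squared": ss_regression / ss_y,
    }


total = regression_anova(ss_y=1489.10, sp_xy=-1674.2, ss_x=5590.6, n=65)
for name, value in total.items():
    print(f"{name:>14s} = {value:.4f}")
```

The `r_squared` entry shows why the two tests agree: the regression sum of squares is the fraction r² of the total, so F can be written entirely in terms of r and N.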

To  test  for  the  significance  of  regression  between  lines,  the  values  already 
computed  for  that  source  of  variation  in  table  2  are  used: 

$$S(y - \bar{y})^2 = 884.7$$

$$\frac{[S(x - \bar{x})(y - \bar{y})]^2}{S(x - \bar{x})^2} = \frac{(-1568.4)^2}{4453.0} = 552.4$$

The values are summarized in table 5.

Table 5. Analysis of Variance to Test the Significance of Regression Between Lines

Variation                     D.F.    Sum of Squares    Mean Square    F-value

Due to regression               1         552.4            552.40      18.29**
Deviations from regression     11         332.3             30.21

Total                          12         884.7

It is thus evident that the regression between lines is extremely significant. The
regression within lines will not be tested for significance since r is not significant
(See table 3).

V. The "t" Test of Significance

Regression coefficients may be tested by means of the "t" test also (See
Fisher, 1934, pp. 126-137). As an illustration, the significance of the regression
of y on x between lines may be tested. From table 3, $b = -0.3522$, $S(x - \bar{x})^2 = 4453.0$,
and $S(y - \bar{y})^2 = 884.7$.

Then,

$$t = \frac{b\sqrt{S(x - \bar{x})^2}}{s} \qquad (9)$$

where

$$s^2 = \frac{S(y - \bar{y})^2 - b^2 S(x - \bar{x})^2}{m - 2} \qquad (10)$$

and m is the number of paired values (here the 13 lines).

Then,

$$s^2 = \frac{884.7 - (-0.3522)^2 (4453.0)}{13 - 2} = \frac{332.3}{11} = 30.21, \qquad s = \sqrt{30.21} = 5.496$$

$$t = \frac{0.3522\sqrt{4453.0}}{5.496} = 4.276 \text{ for 11 D.F.}$$

the negative sign of b being disregarded.

From the "t" table it is obvious that the observed t-value exceeds the 1.0 per cent
point.


Since $\sqrt{F} = t$ for one degree of freedom, it is noted that $\sqrt{F} = \sqrt{18.29} = 4.277$
(from table 5). Thus, it is apparent that tests of significance of regression
coefficients by means of the analysis of variance and the "t" test are equivalent.
Moreover, they give the same result as tests of significance of the correlation
coefficient (Table V.A, Fisher, 1934).
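
The equivalence of the two tests can be verified numerically. The sketch below is illustrative only: it applies equations (9) and (10) to the between-lines sums and compares t² with the F ratio. The sum of products, -1568.4, is an assumption of this example, chosen to agree with b = -0.3522 and the regression sum of squares 552.4.

```python
import math

ss_x = 4453.0      # S(x - xbar)^2 between lines
ss_y = 884.7       # S(y - ybar)^2 between lines
sp_xy = -1568.4    # S(x - xbar)(y - ybar); assumed, consistent with b = -0.3522
m = 13             # number of lines

b = sp_xy / ss_x
s2 = (ss_y - b ** 2 * ss_x) / (m - 2)          # equation (10)
t = b * math.sqrt(ss_x) / math.sqrt(s2)        # equation (9)

ss_regression = sp_xy ** 2 / ss_x              # equation (8)
F = ss_regression / s2                         # 1 and m - 2 degrees of freedom

print(f"b = {b:.4f}  s^2 = {s2:.2f}  t = {t:.3f}  F = {F:.2f}  t^2 = {t * t:.2f}")
```

Algebraically, t² = b²S(x - x̄)²/s², which is the regression sum of squares divided by the residual mean square, that is, F.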

VI.  Substitution  in  Regression  Equation 

The regression equation is usually expressed as $y_e = \bar{y} + b(x - \bar{x})$.

For such a regression between lines one may substitute $\bar{y} = 43.37$, $\bar{x} = 51.92$ and
$b = -0.3522$. The mean values of y and x are obtained directly from table 1 by division of
the totals by 65. The value of b is taken from table 3. Numerically,
$y_e = 43.37 - 0.3522(x - 51.92)$. This regression equation can be simplified to
$y_e = 61.66 - 0.3522x$, where x is any value of plant height. In table 6 is given the mean
height of each line, the mean leaf percentage of each line, and the leaf percentage
predicted from plant height by means of the equation above.

Table 6. Observed Mean Height, Mean Leaf Percentage and Predicted Leaf Percentage of
the 13 Lines of Sweet Clover.

         Observed   Observed   Predicted             Observed   Observed   Predicted
Line     mean       mean %     mean %        Line    mean       mean %     mean %
No.      height     leaves     leaves        No.     height     leaves     leaves
         (x)        (y)        (y_e)                 (x)        (y)        (y_e)

  1       63.8       40.4       39.2           8      50.0       45.0       44.0
  2       65.0       38.4       38.8           9      50.8       41.2       43.8
  3       55.6       44.0       42.1          10      54.0       39.8       42.6
  4       41.2       44.8       47.1          11      58.4       40.0       41.1
  5       39.8       46.0       47.6          12      44.0       51.8       46.2
  6       51.6       45.6       43.5          13      40.8       47.0       47.3
  7       60.0       39.6       40.5

The differences between observed mean leaf percentage and predicted leaf percentage
in table 6 represent errors in prediction. The sum of squares of these differences
would be given by $S(y - y_e)^2$, where y represents the observed mean leaf percentage and
$y_e$ the predicted value. This quantity can be computed from table 6 by subtraction of the
observed and predicted leaf percentage, these values being squared and added to give
$S(y - y_e)^2 = 66.54$. Now this sum of squares is based on means of 5 plants per line
while the analysis of variance in table 5 was on a single plant basis. Therefore,
multiplication of 66.54 by 5 to place it on a single plant basis gives 332.7. This
agrees with the sum of squares due to deviation from regression, i.e., 332.3, considering
that the predicted leaf percentages have been computed to only one place of decimals.
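
The check described above can be repeated mechanically. The following sketch is illustrative only: the tuples hold the table 6 values as read from a damaged scan (a few digits are reconstructions, which is an assumption of this example), and the names `TABLE_6` and `predict` are hypothetical.

```python
# (line, observed mean height x, observed mean % leaves y, predicted % leaves y_e)
# Values as read from table 6; some digits are reconstructed, not verified.
TABLE_6 = [
    (1, 63.8, 40.4, 39.2), (2, 65.0, 38.4, 38.8), (3, 55.6, 44.0, 42.1),
    (4, 41.2, 44.8, 47.1), (5, 39.8, 46.0, 47.6), (6, 51.6, 45.6, 43.5),
    (7, 60.0, 39.6, 40.5), (8, 50.0, 45.0, 44.0), (9, 50.8, 41.2, 43.8),
    (10, 54.0, 39.8, 42.6), (11, 58.4, 40.0, 41.1), (12, 44.0, 51.8, 46.2),
    (13, 40.8, 47.0, 47.3),
]

Y_BAR, X_BAR, B = 43.37, 51.92, -0.3522
PLANTS_PER_LINE = 5


def predict(x):
    """Regression equation y_e = ybar + b(x - xbar)."""
    return Y_BAR + B * (x - X_BAR)


# Largest gap between the equation and the one-decimal predictions in the table.
max_gap = max(abs(predict(x) - y_e) for _, x, _, y_e in TABLE_6)

# Sum of squared errors of prediction on a line-mean basis, then per plant.
ss_means = sum((y - y_e) ** 2 for _, _, y, y_e in TABLE_6)
ss_single_plant = PLANTS_PER_LINE * ss_means

print(f"max |equation - table| = {max_gap:.3f}")
print(f"S(y - y_e)^2 on means  = {ss_means:.2f}")
print(f"on single-plant basis  = {ss_single_plant:.1f}")
```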

It may be noted also that $s^2$ used in the "t" test could be written
$s^2 = S(y - y_e)^2 / (m - 2)$, since $S(y - y_e)^2 = S(y - \bar{y})^2 - b^2 S(x - \bar{x})^2$,
the latter form being simpler for computation purposes.

While the application of analysis of variance and covariance to correlation and
regression problems has been illustrated here with data from a very simple experiment,
it is evident that it is equally applicable to problems of any degree of complexity. 1/
The analysis of variance and covariance are keyed out for the particular problem
under investigation, after which the correlation and regression coefficients are
calculated for any component of the total variability. The tests of significance are
made in a manner similar to the ones illustrated.

VII.  Use  of  Covariance 

The analysis of covariance is often successfully applied in an artificial reduction
of experimental error in certain types of experiments where preliminary or uniformity
trial data are available. There may be factors which it is impossible to equalise
satisfactorily between the different treatments, and yet there may be reason to
suppose that greater accuracy would arise from their equalization, were that possible.
Availability of preliminary data may provide the basis for such equalization.

The possible use of data from a previous uniformity trial to reduce errors due to
soil heterogeneity in the experimental years has been given considerable attention in
field trials in recent years. The assumption is that soil fertility is constant from
year to year. Thus, a significant correlation between the same plots in successive
years may be used to reduce the error in the experimental year. The regression
equation is applied to predict the yields in the experimental year from the yields of the
same plots grown under uniform treatment in the previous year. The deviations from
the predicted yields should then contribute to the error of the experiment. Methods
to utilize information from previous crop records have been outlined by Fisher (1934),
Sanders (1930), Eden (1931), and by Wishart and Sanders (1935). With annual crops,
Summerby (1934) found that it was not worthwhile to sacrifice a year to a uniformity
trial in order to obtain information to reduce the error in the experimental year.
The method seems to have the greatest possibilities with perennial crops. (See
Fisher, 1937).

Another possible application of covariance arises where stand counts are available
in addition to yield. Mahoney and Baten (1939) have made such an application. Stand
counts may furnish a good index of plot variability provided they have been unaffected
by treatment. Correction for stand, which can be made from the regression relation,
provides an adjustment of the data to what they would be if all plots had the same
number of plants (proportionality assumed). When yield is related to plant number it
is obvious that the experimental error will be decreased when this factor is taken
into account and a correction made for it. It is first necessary to determine whether
or not such a relationship exists.

1/ For a consideration of curvilinear regression and its treatment by the analysis of
variance, the reader is referred to more advanced works on the subject.

The simpler aspects of covariance as applied to between and within groups have
already been considered. The method will be used here for ordinary field experiments
where the total variation is subdivided into more than two parts. An illustrative
example used by Fisher (1934) will be followed.

VIII. Use of Preliminary Trial Data for Error Reduction

Some data collected by Eden (1931) 2/ on tea will be used to illustrate the
calculations for covariance for preliminary and experimental yields. Four "dummy"
treatments for yields of tea expressed in per cent of the mean in a randomized block
experiment are given in table 7.

Table 7. Preliminary and Experimental Yields of Tea Plants

              Preliminary (x)
              or                          Blocks              Treatment
Treatment     Experimental (y)      1      2      3      4      Total      Mean

A                   x              91    118    109    102       420      105.00
                    y              85    121    114    107       427      106.75
B                   x              88     94    105     91       378       94.50
                    y              81     93    106     92       372       93.00
C                   x              88    110    115     96       409      102.25
                    y              90    106    111    102       409      102.25
D                   x             102    109     94     88       393       98.25
                    y              93    114     93     92       392       98.00

Block Total         x             369    431    423    377      1600
                    y             349    434    424    393      1600

The preliminary yields will be designated as x and the experimental yields as y in
the subsequent calculations.

(a) Analysis of Variance and Covariance for Preliminary and Experimental Yields.

In the formulas below, $x_t$ and $x_b$ (or $y_t$ and $y_b$) denote treatment and block
totals, each the sum of n = 4 plots, and N = 16 is the total number of plots.

The sums of squares for preliminary yields:

Total:        $Sx^2 - (Sx)^2/N = 1526.0$
Treatments:   $Sx_t^2/n - (Sx)^2/N = 253.5$
Blocks:       $Sx_b^2/n - (Sx)^2/N = 745.0$

Sums of squares for experimental yields:

Total:        $Sy^2 - (Sy)^2/N = 2040.0$
Blocks:       $Sy_b^2/n - (Sy)^2/N = 1095.5$
Treatments:   $Sy_t^2/n - (Sy)^2/N = 414.5$

2/ Cited by R. A. Fisher (1934).

Sums of Products:

Total:        $Sxy - (Sx)(Sy)/N = 1612.00$
Treatments:   $Sx_t y_t/n - (Sx)(Sy)/N = 323.25$
Blocks:       $Sx_b y_b/n - (Sx)(Sy)/N = 837.00$

The above results are incorporated in table 8.

Table 8. Analysis of Variance and Covariance

Variation               Sums of Squares     Sums of        Mean Squares         F-value
due to         D.F.      (x)       (y)      Products       (x)       (y)       (x)      (y)
                                              (xy)

Blocks           3      745.0    1095.5      837.00      248.33    365.17     4.24*    6.20*
Treatments       3      253.5     414.5      323.25       84.50    138.17     1.44     2.35
Error            9      527.5     530.0      451.75       58.61     58.89

Total           15     1526.0    2040.0     1612.00


From this analysis it is clear that the "dummy" treatments produced no significant
differences in yield, while a considerable degree of soil heterogeneity evidently
exists because the variation between blocks proved significant for both the
preliminary and experimental data.
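
The entries of table 8 can be reproduced from the plot yields. The sketch below is illustrative only: the yields are the table 7 values as recovered from the scan (entries that were unreadable were inferred from the printed totals, which is an assumption of this example), and the helper name `partition` is hypothetical.

```python
# Rows are treatments A-D, columns are blocks 1-4 (yields in per cent of the mean).
PRELIMINARY = [[91, 118, 109, 102], [88, 94, 105, 91],
               [88, 110, 115, 96], [102, 109, 94, 88]]
EXPERIMENTAL = [[85, 121, 114, 107], [81, 93, 106, 92],
                [90, 106, 111, 102], [93, 114, 93, 92]]


def partition(u, v):
    """Sums of products of u and v for a randomized block layout.

    Passing the same table twice gives the sums of squares.
    """
    n_treat, n_block = len(u), len(u[0])
    n = n_treat * n_block
    correction = sum(map(sum, u)) * sum(map(sum, v)) / n        # (Su)(Sv)/N
    total = sum(a * b for ru, rv in zip(u, v) for a, b in zip(ru, rv)) - correction
    treatments = sum(sum(ru) * sum(rv) for ru, rv in zip(u, v)) / n_block - correction
    blocks = (sum(sum(cu) * sum(cv) for cu, cv in zip(zip(*u), zip(*v))) / n_treat
              - correction)
    error = total - treatments - blocks                         # by subtraction
    return {"blocks": blocks, "treatments": treatments, "error": error, "total": total}


ss_x = partition(PRELIMINARY, PRELIMINARY)
ss_y = partition(EXPERIMENTAL, EXPERIMENTAL)
sp_xy = partition(PRELIMINARY, EXPERIMENTAL)

for source in ("blocks", "treatments", "error", "total"):
    print(f"{source:<11s} {ss_x[source]:8.1f} {ss_y[source]:8.1f} {sp_xy[source]:9.2f}")
```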

It is now proposed to test the covariance as a basis to provide a correction
for the mean experimental yields in an effort to reduce the soil heterogeneity effect
further. The analysis of covariance is given in table 9.

Table 9. Analysis of Covariance and Test of Significance of Adjusted Experimental
Means

                               Sum of     Sum of      Sum of        Errors of Estimate
Variation due to      D.F.     Squares    Products    Squares    Sum of               Mean
                                 (x)        (xy)        (y)      Squares     D.F.    Square

Blocks                  3       745.0      837.00     1095.5
Treatments              3       253.5      323.25      414.5
Error                   9       527.5      451.75      530.0      143.12       8      17.89

Total                  15      1526.0     1612.00     2040.0
Tr. + Error            12       781.0      775.00      944.5      175.45      11      15.95

Test of significance for adjusted treatment means                  32.33       3      10.78

F = 17.89/10.78 = 1.66, non-significant

Since  the  total  has  been  broken  down  into  more  than  two  parts,  it  is  necessary  to 
form  a  new  total  which  contains  only  the  two  effects  under  study,  viz.,  treatment  and 
error.  This  new  total  is  in  the  line,  treatment  +  error,  in  table  9.  The  degrees  of 
freedom,  sums  of  squares,  and  products  are  added  to  obtain  the  appropriate  numbers. 

The sums of squares for errors of estimate, $S(y - y_e)^2$, are calculated by use of the
principle of subtraction, viz.,

$$\left[Sy^2 - \frac{(Sy)^2}{N}\right] - \frac{[Sxy - (Sx)(Sy)/N]^2}{Sx^2 - (Sx)^2/N},$$

in the lines for error, and treatment + error. These computations are as follows:

(1) Error: $530.0 - (451.75)^2/527.5 = 143.12$

(2) Treatments + Error: $944.5 - (775)^2/781 = 175.45$

The sum of squares for error is subtracted from that for treatment + error to yield
the sum of squares appropriate for the test of significance for the adjusted
treatment means, viz., 175.45 - 143.12 = 32.33.

In this particular case, the mean square for adjusted treatment means is not
significantly different from error since "dummy" treatments were used.
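
The subtraction steps of table 9 are sketched below in Python. This is an illustration only; the function name `errors_of_estimate` is an assumption of this example, and the inputs are the sums from the error line and the treatment + error line.

```python
def errors_of_estimate(ss_x, sp_xy, ss_y):
    """S(y - y_e)^2: the sum of squares of y left after linear regression on x."""
    return ss_y - sp_xy ** 2 / ss_x


error = errors_of_estimate(ss_x=527.5, sp_xy=451.75, ss_y=530.0)            # 8 d.f.
treat_plus_error = errors_of_estimate(ss_x=781.0, sp_xy=775.0, ss_y=944.5)  # 11 d.f.
adjusted_treatments = treat_plus_error - error                              # 3 d.f.

ms_error = error / 8
ms_treatments = adjusted_treatments / 3

print(f"error              = {error:.2f}  (mean square {ms_error:.2f})")
print(f"treatments + error = {treat_plus_error:.2f}")
print(f"adjusted treatment = {adjusted_treatments:.2f}  (mean square {ms_treatments:.2f})")
# The treatment mean square is the smaller one here, so the ratio is inverted as in table 9.
print(f"ratio of mean squares = {ms_error / ms_treatments:.2f}")
```

One degree of freedom is lost from each line to the regression, which is why the error line carries 8 rather than 9 degrees of freedom.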

(b) Calculation of the Regression Coefficient

The regression coefficient (b) is calculated from the values in table 9. The
regression required is the regression of y on x in the row designated error. Since
the regression coefficient is the ratio of the sum of products to the sum of squares of the
independent variable,

$$b = \frac{Sxy - (Sx)(Sy)/N}{Sx^2 - (Sx)^2/N} = \frac{451.75}{527.50} = 0.8564$$

The significance of the error regression, b = 0.8564, should be tested at this point.
Unless it is significant, there will be little advantage in using it to reduce the
error for the experimental year. The sum of squares due to linear regression will
be:

$$\frac{[Sxy - (Sx)(Sy)/N]^2}{Sx^2 - (Sx)^2/N} = \frac{(451.75)^2}{527.50} = 386.88$$

The test for significance is summarized in table 10.

Table 10. Test of Significance for Error Regression

Variation                                                               Sum of     Mean
due to             Formula                                      D.F.    Squares    Square       F

Regression         [Sxy - (Sx)(Sy)/N]^2 / [Sx^2 - (Sx)^2/N]       1     386.88     386.88     21.63

Deviations from    [Sy^2 - (Sy)^2/N]
regression           - [Sxy - (Sx)(Sy)/N]^2 / [Sx^2 - (Sx)^2/N]   8     143.12      17.89 1/

Total for Error    Sy^2 - (Sy)^2/N                                9     530.00      58.89

1/ Error for adjusted yields.

The observed F-value is highly significant. It indicates that it will be worth while
to proceed with the correction of the experimental test data on the basis of their
regression on the preliminary yields.
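
The figures of table 10 follow from the three error-line sums alone. A minimal sketch, with variable names that are assumptions of this example:

```python
# Error line of the analysis of covariance (9 degrees of freedom).
ss_x_error = 527.5     # preliminary yields
sp_xy_error = 451.75   # sum of products
ss_y_error = 530.0     # experimental yields
error_df = 9

b = sp_xy_error / ss_x_error                   # error regression of y on x
ss_regression = sp_xy_error ** 2 / ss_x_error  # 1 degree of freedom
ss_deviations = ss_y_error - ss_regression     # error_df - 1 degrees of freedom
ms_deviations = ss_deviations / (error_df - 1)
f_value = ss_regression / ms_deviations

print(f"b = {b:.4f}")
print(f"regression SS = {ss_regression:.2f}, deviations SS = {ss_deviations:.2f}")
print(f"F = {f_value:.2f} on 1 and {error_df - 1} degrees of freedom")
```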

(c) The Adjusted Treatment Means

The adjusted treatment means can be calculated and compared with the unadjusted.
The formula for the adjusted values is $\bar{y} - b(x - \bar{x})$, where $\bar{y}$ is the mean
yield of the individual treatment in the experimental year and $(x - \bar{x})$ is the deviation
of its preliminary mean from the general mean. The computations are given in table 11.
The mean yields per treatment of the original data, x and y, are taken directly from
table 7.

Table 11. Calculation of Mean Yields of Treatments in Experimental Test Corrected for
Yields in Preliminary Test.

           Mean Yield     Deviations                     Mean Yield      Corrected Yields for
Treat-     Preliminary    from Mean      Product 1/      Experimental    Experimental Test
ment       Test (x)       (x - x̄)        b(x - x̄)        Year (y)        y - b(x - x̄)

A           105.00           5.00           4.28           106.75            102.47
B            94.50          -5.50          -4.71            93.00             97.71
C           102.25           2.25           1.93           102.25            100.32
D            98.25          -1.75          -1.50            98.00             99.50

Gen. Mean   100.00           0.00           0.00           100.00            100.00

1/ $b = \dfrac{Sxy - (Sx)(Sy)/N}{Sx^2 - (Sx)^2/N} = 0.8564$

The regression equation for error for y on x is as follows:

$$y_e = \bar{y} + b(x - \bar{x}) = 100.0 + 0.8564(x - 100.00) = 0.8564x + 14.36$$

The graphical representation is shown below, the points for the determination of the
line being obtained by substitution in the regression equation.

Let x = 92.5, $y_e$ = 93.58. Let x = 107.5, $y_e$ = 106.42.

[Figure: regression line of yields in experimental year (y) on yields in preliminary
test (x); the x-axis runs from 90 to 110.]
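
The adjustment in table 11 is a one-line calculation per treatment. The sketch below is illustrative; the dictionary names are assumptions of this example, and the treatment means are those printed in table 7.

```python
b = 451.75 / 527.5   # error regression of experimental on preliminary yield

preliminary = {"A": 105.00, "B": 94.50, "C": 102.25, "D": 98.25}   # treatment means of x
experimental = {"A": 106.75, "B": 93.00, "C": 102.25, "D": 98.00}  # treatment means of y

x_bar = sum(preliminary.values()) / len(preliminary)   # general mean of x

# Corrected yield: treatment mean of y less b times the deviation of its x mean.
adjusted = {
    name: experimental[name] - b * (preliminary[name] - x_bar)
    for name in preliminary
}

for name in sorted(adjusted):
    print(f"{name}: unadjusted {experimental[name]:7.2f}   adjusted {adjusted[name]:7.2f}")
```

Because the deviations of the preliminary means sum to zero, the adjustment leaves the general mean of the experimental yields unchanged.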


(d) Standard Error of a Difference

The standard error of a given difference between the corrected mean yields
is given by Wishart and Sanders (1936) as follows:

$$\sigma_d = \sqrt{\frac{2s^2}{n} + \frac{s^2(\bar{x}_1 - \bar{x}_2)^2}{A'}} \qquad (11)$$

where $s^2$ = the variance of the corrected yields (17.89), n = the number of plots per
treatment (4), A' = the sum of squares for error in the original preliminary trial
(527.5), and $\bar{x}_1$ and $\bar{x}_2$ = the means of the preliminary treatment plots being
compared (105.00 - 94.50 = 10.50).

For treatments A and B, the mean difference of the corrected yields (table 11) is
102.47 - 97.71 = 4.76. The standard error is computed as follows:

$$\sigma_d = \sqrt{\frac{2(17.89)}{4} + \frac{(17.89)(10.50)^2}{527.5}} = 3.56$$

$$d/\sigma_d = 4.76/3.56 = 1.34, \text{ a non-significant value.}$$
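
Equation (11) can be wrapped in a small function so that any pair of treatments may be compared. This is a sketch under the definitions given above; the function and argument names are assumptions of this example.

```python
import math


def se_adjusted_difference(s2, n, x1_mean, x2_mean, ss_x_error):
    """Standard error of the difference between two adjusted treatment means.

    s2         : error mean square of the adjusted yields
    n          : number of plots per treatment
    x1_mean,
    x2_mean    : preliminary means of the two treatments compared
    ss_x_error : error sum of squares of the preliminary yields (A')
    """
    return math.sqrt(2.0 * s2 / n + s2 * (x1_mean - x2_mean) ** 2 / ss_x_error)


se = se_adjusted_difference(s2=17.89, n=4, x1_mean=105.00, x2_mean=94.50,
                            ss_x_error=527.5)
difference = 102.47 - 97.71          # adjusted means of treatments A and B
print(f"standard error = {se:.2f}, difference / standard error = {difference / se:.2f}")
```

The second term under the root grows with the gap between the two preliminary means, so treatments that differed widely in the preliminary trial are compared with less precision.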


(e) Factors in Use of Independent Variable

The investigator usually wishes to know when it is worthwhile to introduce
the independent variable into the experiment. This question is answered by Snedecor
(1937) who states that three items will aid him. First, the list of actual and
adjusted means. Sometimes the rank order of adjusted means is quite different from
that of the unadjusted and the shifts may be interpreted. Second, a comparison of
the sum of squares of errors of estimate (table 9) used to test treatment
significance, 32.33, with $Sy^2 - (Sy)^2/N = 414.5$. The latter is far greater than the former.
Third, the change in precision of the experiment due to the adjustment of the error
sums of squares. This is indicated in table 10. The sum of squares, $Sy^2 - (Sy)^2/N$ =
530.00 with 9 degrees of freedom, is analyzed into two parts, one with a single
degree of freedom that measures the variation attributable to regression, the other 8
degrees of freedom being assigned to error. The mean square for error is reduced
from 58.89 to 17.89, which is highly significant. These factors will enable the
investigator to decide whether to retain the independent variable in similar experiments.
It has already been mentioned that the use of preliminary uniformity data to reduce
the error in the subsequent experimental test may be useful in perennial crops, but
probably is not worth while for annual crops.

Questions for Discussion

1. What is covariance? Where is it useful in experimental work?

2. Interpret the use of the analysis of variance for the determination of
significance of the regression coefficient.

3. How is the error for linear regression computed?

4. How can covariance be used on preliminary trial data to reduce the error in the
experimental year?

5. Discuss conditions in field experimentation where it might be useful to use
preliminary trial data to reduce the error in the subsequent test.

6. What assumption is made in the correction of stand by covariance? What
precautions are necessary?

7. Upon what is the error of estimate based? Explain.

8. What does it mean when the mean square for adjusted treatment means actually is less
than the mean square for the error of estimate for error?

9. Name 3 types of agronomic tests where covariance might prove useful. Give the
reason in each case.


Problems 

1. The yields of soybeans in a randomized block experiment with split plots are given
below. Let x represent the yield of hay in tons per acre and y represent the
yield of seed in bushels per acre.

The total yields of the 4 plots of each spacing are assembled below.

                     Bu. of seed per acre (y)

Width               Spacing within rows
of rows      1/2"       1"        2"        3"        Sum

16"          89.8      91.8      79.6      88.6      349.8
20"          92.7      85.6      87.2      87.1      352.6
24"          90.6      82.3      84.3      80.7      337.9
28"          86.0      83.0      82.4      78.3      329.7
32"          85.1      78.4      74.6      72.9      311.0
40"          78.4      70.7      71.7      69.2      290.0

Sum         522.6     491.8     479.8     476.8     1971.0

                     Tons of hay per acre (x)

Width               Spacing within rows
of rows      1/2"       1"        2"        3"        Sum

16"         11.40     10.72      9.63      9.68      41.43
20"         11.31     10.06      9.73      9.31      40.41
24"         10.02      9.21      9.00      8.41      36.64
28"          9.62      9.44      9.09      8.28      36.43
32"          9.58      8.72      8.45      7.77      34.52
40"          8.81      8.19      7.34      7.59      31.93

Sum         60.74     56.34     53.24     51.04     221.36

The analyses of variance for x and y are given below.

Variation                                                          Correlation    Regression
due to:            D.F.    (y - ȳ)²    (x - x̄)(y - ȳ)   (x - x̄)²   of x and y     of y on x

Blocks               3      10.4370                       .1438
Width of rows        5     182.0500                      3.9684
Error (a)           15      32.5542                       .3533
Sub-total           23     225.0412                      4.4655
Spacings             3      54.7512                      2.2108
Width x spacing     15      30.4300                       .2389
Error (b)           54     268.6038                      1.5984

Total               95     578.8262                      8.5115

(a) Calculate an analysis of covariance.

(b) Calculate the correlation coefficient for the different lines in the
analysis of variance and covariance.

(c) Do the same for the regression coefficient of y on x.

(d) Test the significance of the correlation coefficients by means of Fisher's
table V.A. Mark the coefficients which exceed the 5 per cent point with
one asterisk (*) and those which exceed the 1 per cent point with two
asterisks (**).

(e) Test the significance of the regression for error (b) by means of an analysis
of variance. Get the $\sqrt{F}$ also.

(f) Test the significance of the regression for error (b) by means of the "t" test.

(g) Test the significance of the correlation coefficient for error (b) by means
of the "t" test given by Fisher in section 34 of his book.

(h) Calculate the mean yield of seed in bushels for the six different widths of
rows. Calculate the predicted mean yields for each width, using the
regression for width of rows.


2. Some data on number of sugar beet plants per plot and yield in tons per acre are
given by Snedecor (1937) for a fertilizer experiment conducted in a randomized
block test. The data for 3 replications are given below:

Fertilizer    No. (x) or                 Block                        Treatment   Sums      Sums
Applied       Yield (y)        1             2             3          Sum         Squares   Products

None              x           183           176           291
                  y       [illegible]       2.25      [illegible]
P                 x           356           300           301
                  y           6.71      [illegible]       4.92
K                 x           224           258           244
                  y       [illegible]       4.14          2.32
PK                x       [illegible]   [illegible]       303
                  y           6.34      [illegible]   [illegible]
PN                x           371           334           332
                  y           6.48          7.11          5.88
KN                x           230           221           237
                  y           3.70          3.24          2.82
NPK               x           322           367       [illegible]
                  y           6.10          7.48          7.37

Block Sums        x
                  y
Sums Squares      x
                  y
Sums Products

Due to the fact that the number of plants varies it is necessary to examine the
effect of the variable stand and to estimate the yields on the basis of equal numbers
of plants. Calculate as follows:

(a) Yield in a simple randomized block experiment.

(b) The analysis of covariance of stand and yield.

(c) Calculate the test for significance.

(d) Give the conclusions for the test.

3. A sugar beet variety test was conducted at Rocky Ford as a randomized block
experiment in which the number of plants differed in each plot. The yields were taken
on the basis of competitive plants per plot, and also on the basis of all the
beets in the plot. The object of the experiment was to determine the yields of
the different varieties. Since the number of beets varies from plot to plot, it
is desired to examine the effect of the variable stand and to estimate the yields
on the basis of equal numbers of beets. The data follow (unpublished data from
G. W. Deming):

Variety    No. (x) or                          Block                            Treatment
No.        Yield (y)       1        2        3        4        5        6       Totals

1              x          243      217      227      210      218      215
               y         17.49    18.65    14.39    16.33    11.28    15.75
2              x          245      217      239      210      205      219
               y         17.99    19.22    18.21    14.11    16.84    13.76
3              x          238      228      205      191      224      211
               y         14.13    16.62    15.99    13.81    14.80    13.00
4              x          254      223      189      180      236      209
               y         20.19    21.45    14.01    12.54    14.20    17.65
5              x          249      221      226      242      246      216
               y         20.08    17.04    14.04    16.05    13.86     9.75
6              x          225      212      194      211      202      215
               y         17.49    19.63    17.55    16.02    15.46    14.05

Block          x
Totals         y

Calculate the analysis of covariance and adjust the yields to a uniform stand
basis.


FIELD  PLOT  TECHNIQUE 


PART III


Field  and  Other  Agronomic  Experiments 


CHAPTER  XII 
SOIL  HETEROGENEITY  AND  ITS  MEASUREMENT 

I.  Universality  of  Soil  Heterogeneity 

One of the difficulties in yield tests is the fact that uniform soil conditions
rarely exist, even over a small portion of any field. Soil variability has been
noted by many investigators, but it was J. Arthur Harris (1915)(1920) who first
presented data to show its extreme importance in field experimentation. Lyon (1911)
states that it is "quite likely that productivity of plots change from year to year
even with the same treatment", altho the work of Harris and Schofield (1920)(1928)
and of Garber, et al. (1926)(1930) indicates a tendency for the differences in plot
yields to be permanent.

A soil with differences so slight as to escape the most observant eye may have very
great effects on plants which grow in it. Parker (1931) is authority for the
statement that two plots of the same crop variety grown in "an apparently uniform soil
and treated alike in every respect may differ from one another in yield by 20 per
cent or more solely as a result of differences in soil conditions." Small plots have
generally replaced large ones to correct for this condition, because it is obvious
that two plants of the same variety grown one yard apart are more likely to yield
alike than when 200 feet apart as probably would be true of one-acre plots. It is
impossible to avoid variation even under such conditions. Davenport and Fraser
(1896) report results with 77 varieties of wheat grown on plots two rods square.
Nine check plots of the same variety were systematically distributed over the area.
The variation in the check plots was so great that only 8 varieties yielded more than
the highest check, and but 3 lower than the lowest check.

Soils vary in texture, depth, drainage, moisture, and available plant nutrients from
yard to yard. After the analyses of large amounts of data from all over the world,
Harris (1920) concluded that soil heterogeneity was practically universal. He
estimated it to be the most potent cause of variation in plot yields and the chief
difficulty in their interpretation. In 1915 he stated: "It is obviously idle to conclude
from a given experiment that variety 'A' yields higher than variety 'B', or that
fertilizer 'X' is more effective than fertilizer 'Y', unless the differences found
are greater than those which might be expected from differences in the productive
capacity of the plots of soils upon which they are grown." Even earlier than this,
Piper and Stevenson (1910) remarked that soil variability was so great that "doubt
was cast on the greater portion of published field experiments where yield was
primarily involved." The yield differences must be large enough to overshadow soil
variation, or the experiment designed so as to remove its effect.

Much  of  the  improvement  in  experimental  methods  for  field  experiments  in  recent  years 
has  been  brought  about  thru  special  devices  to  measure  much  of  the  soil  fertility 
variation  and  essentially  eliminate  it  from  the  actual  comparisons  being  made. 

II.  Uniformity  Trial  Data 

Uniformity  trial  data  ha.ve  been  used  for  the  measurement  of  soil  heterogenity  as 
well  as  for  many  other  purposes  in  field  experimentation.  The  usual  procedure  is  to 
plant  a  bulk  crop,  the  area  being  later  partitioned  into  small  plots,  usually  of  the 
same  dimensions.  The  same  cultural  operations  are  carried  out  over  the  entire  area. 
The  yield  of  each  plot  is  recorded  separately  at  harvest.  The  usefulness  of  the 
uniformity  trial  lies  in  the  fact  that  the  small  units  can  be  combined  into  larger 
plots  of  various  sizes  and  shapes  in  order  to  study  variability.  The  variation  in 
yield over the field is due to soil heterogeneity, as well as to plant variation,
errors in weighing, etc. (generally summed up as experimental error). The most
obvious use for the data is to provide information on the optimum size and shape of
plot. Uniformity trial data can also be used to compare the relative efficiencies
of different experimental designs, particularly in relation to a certain crop. Data
from previous uniformity trials may also be used to reduce the error of subsequent
experiments laid down on the same plots.
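
The combining of small harvest units into larger plots of various sizes and shapes can be sketched in a few lines of Python. This is only an illustration: the function names `combine_plots` and `coefficient_of_variation` are hypothetical, the small grid of yields is invented, and the coefficient of variation is assumed here as the measure of variability (the text does not prescribe one).

```python
from statistics import mean, stdev


def combine_plots(grid, rows_per_plot, cols_per_plot):
    """Sum adjacent unit yields into larger plots of the given shape."""
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % rows_per_plot or n_cols % cols_per_plot:
        raise ValueError("grid dimensions must be multiples of the plot shape")
    combined = []
    for r0 in range(0, n_rows, rows_per_plot):
        combined_row = []
        for c0 in range(0, n_cols, cols_per_plot):
            total = sum(
                grid[r][c]
                for r in range(r0, r0 + rows_per_plot)
                for c in range(c0, c0 + cols_per_plot)
            )
            combined_row.append(total)
        combined.append(combined_row)
    return combined


def coefficient_of_variation(grid):
    """Standard deviation of the plot yields in per cent of their mean."""
    values = [v for row in grid for v in row]
    return 100.0 * stdev(values) / mean(values)


# Invented unit yields (2 rows by 4 columns), for illustration only.
units = [[1, 2, 3, 4],
         [5, 6, 7, 8]]

for shape in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    plots = combine_plots(units, *shape)
    print(shape, plots, round(coefficient_of_variation(plots), 1))
```

Comparing the variability obtained for each shape shows which plot size and orientation would have been least affected by the soil differences in that field.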

The  method  offers  promise  for  perennial  crops  where  the  same  plants  are  concerned, 
but  offers  little  or  no  advantage  for  annual  crops.  A  catalogue  of  uniformity  trial 
data has been published by Cochran (1937).

Some  agronomists  conduct  so-called  blank  trials  (planted  to  a  bulk  crop)  to  observe 
soil  heterogeneity  as  a  preliminary  step  in  experimentation  on  a  new  field.  Love 
(1928) advocates such trials, especially as a preliminary to long-time experiments.
They afford an opportunity for the investigator to detect good and poor spots on a
field so that unsatisfactory areas may be eliminated. One objection to the blank
trial used in this manner is that it takes time. Time may be an important element in
an experiment.

III. Criteria for the Measurement of Soil Variability

Some accurate measure of soil heterogeneity may be desirable preliminary to steps for
its correction.

(a) Correlation Coefficient

Harris (1915) supplied the first quantitative measure based on correlation,
his heterogeneity coefficient being an intra-class correlation coefficient. For use
of the formula, the field must be planted uniformly to the same crop and harvested in
small units. Harris grouped nearby plots. The number in a group was arbitrary, it
being common to use 2 by 1, 2 by 2, and 2 by 3-fold groupings. The size of the
heterogeneity coefficient is influenced by the size of group. The more plots that
are put together, the greater is the correlation coefficient. The heterogeneity
coefficient is expressed on a relative scale from 0.0 to 1.0 so that comparisons from
field to field can be made directly. This coefficient measures the degree to which
nearby plots are similar in productivity. Should the correlation be sensibly zero,
the irregularities of the field are not so great as to influence in the same
direction the yields of nearby small plots. The higher the correlation, the greater the
soil heterogeneity. One may grasp the significance when he remembers that the
correlation coefficient multiplied by 100 gives the most probable percentage deviation of
the yield of an associated plot when the deviation of one plot of the group from the
general average is known. Hayes and Garber (1927), in explanation, state that in
"patchy" fields certain contiguous units tend to yield high while others show a
tendency in the opposite direction. Under these conditions a high correlation
coefficient results. Where variability is due only to random sampling the correspondence
between contiguous plots will be counter-balanced by lack of correspondence in others.
The same result can be obtained with the ordinary inter-class correlation coefficient
as with the heterogeneity coefficient when a 2 by 1-fold arrangement is used.

The analysis of variance can be used to obtain the same result as with the
heterogeneity coefficient, as indicated by Fisher (1934). Intra-class correlation merely
measures the relative importance of two groups of factors that cause variation. In
the calculation it is necessary to obtain these equalities:

$$ S(x - \bar{x})^2 = (m - 1)\,n\,s^2 \qquad (1) $$

$$ n\,S(\bar{x}_b - \bar{x})^2 = (m - 1)\,s^2\,[1 + (n - 1)\,r] \qquad (2) $$

where m = the number of arbitrary blocks, n = the number of ultimate units within a
block, and $\bar{x}_b$ = the mean of a block.
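
Dividing (2) by (1) eliminates $s^2$ and gives the correlation directly from the two sums of squares. The shorthand $T$ (total sum of squares, the left side of (1)) and $B$ (between-block sum of squares, the left side of (2)) is introduced here only for compactness and is not the notation of the text:

$$ \frac{B}{T} = \frac{1 + (n - 1)\,r}{n} \quad\Longrightarrow\quad r = \frac{nB - T}{(n - 1)\,T} $$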


The principal value of the correlation coefficient, either inter-class or
intra-class, is to demonstrate that the fertilities of adjacent areas are correlated and
that variability exists in the field.

(b)  Fertility  Diagram 

The  suitability  of  a  particular  lay-out  adopted  in  an  experiment  can  be 
judged to a considerable extent by a fertility diagram constructed from the
individual plot yields. This is possible from uniformity trial data. An example taken
from  Crowther  and  Bartlett  (1938)  is  given  in  Figure  1. 


Figure 1
Variation in natural fertility at Bahtim, 1934 (yield
in kantars per feddan).


IV. Computation of Heterogeneity by the Analysis of Variance

Suppose a field is divided into N small plots, all sown to the same variety. Some
uniformity trial data from Mercer and Hall (1911) on the grain yields of one acre of
wheat when harvested in 1/500-acre plots will be used to illustrate the method of
computation. The yields in pounds per plot for the 24 plots in the northwest corner
are as follows:
It is noted that the area is divided into an arbitrary number of blocks all equal in
size, i.e., there are 6 blocks with 4 ultimate plots in each.

Let x = the value of an ultimate plot unit.
N = the total number of plots = 24.
$S(x)$ = sum of all the ultimate plots = 101.54.
$\bar{x}$ = mean yield of the ultimate plots = 4.2308.
$S(x^2)$ = sum of squares of yields of the ultimate plots = 434.5582.

Then, the sums of squares are computed as follows ($x_b$ being the total of a block
and n the number of plots in a block):

$$ \text{Total} = S(x^2) - \frac{(Sx)^2}{N} = 434.5582 - 429.5988 = 4.9594 $$

$$ \text{Between blocks} = \frac{S(x_b^2)}{n} - \frac{(Sx)^2}{N} = \frac{1727.9410}{4} - 429.5988 = 2.3864 $$

$$ \text{Within blocks} = 4.9594 - 2.3864 = 2.5730 $$
The analysis of variance is as follows:

Variation          D.F.    Sums of Squares

Between blocks       5     2.3864 = $(m-1)\,s^2\,[1 + (n-1)\,r]$
Within blocks       18     2.5730 = $(m-1)\,s^2\,(n-1)(1-r)$
Total               23     4.9594 = $(m-1)\,n\,s^2$

Now, let m = the number of blocks, m - 1 = the degrees of freedom for blocks, n = the
number of plots per block, and $s^2$ = the estimated variance.

Then m = 6, m - 1 = 5, and n = 4.

Since $(m-1)\,n\,s^2 = 4.9594$, $20s^2 = 4.9594$, and $s^2 = 0.2480$.

From the formula for the sum of squares between blocks,

$$ (m-1)\,s^2\,[1 + (n-1)\,r] = 2.3864 $$

or

$$ (5)(0.2480)(1 + 3r) = 2.3864 $$

Then r = 0.3082
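
The computation above can be sketched in Python as a check on the arithmetic. The function names `sums_of_squares` and `intraclass_r` are hypothetical, and the sketch assumes blocks of equal size, as in the worked example. Only the sums of squares quoted in the text are used, since the individual plot yields are not reproduced here.

```python
def sums_of_squares(blocks):
    """Return (total, between, within) sums of squares for equal-sized blocks."""
    n = len(blocks[0])
    if any(len(block) != n for block in blocks):
        raise ValueError("all blocks must contain the same number of plots")
    values = [x for block in blocks for x in block]
    correction = sum(values) ** 2 / len(values)        # (Sx)^2 / N
    total = sum(x * x for x in values) - correction     # S(x^2) - (Sx)^2 / N
    between = sum(sum(block) ** 2 for block in blocks) / n - correction
    return total, between, total - between


def intraclass_r(total_ss, between_ss, m, n):
    """Solve (m-1) s^2 [1 + (n-1) r] = between_ss, with s^2 = total_ss / ((m-1) n)."""
    if total_ss == 0:
        raise ValueError("no variation among plots; r is undefined")
    s2 = total_ss / ((m - 1) * n)
    return (between_ss / ((m - 1) * s2) - 1) / (n - 1)


# Sums of squares from the worked example: 6 blocks of 4 plots each.
r = intraclass_r(total_ss=4.9594, between_ss=2.3864, m=6, n=4)
print(round(r, 3))  # about 0.308, in agreement with the value found above
```

When every plot has the same yield the total sum of squares is zero, so the sketch raises an error rather than returning a coefficient.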

V. Amount of Soil Heterogeneity

In his studies of soil heterogeneity, Harris (1915) (1920) used fields planted to the
same crop, but harvested in separate small plot units. The relative productivity of
contiguous plots was determined.

(a)  Variations  in  Yield  in  Same  Season 

Some  of  the  results  obtained  by  Harris  are  given  by  Hayes  and  Garber  (1927): 

Crop        Character      Plot Size         Investigator     r

Wheat       Grain Yield    5.5 by 5.5 ft.    Montgomery       0.605 ± 0.029
            N-Content                                         0.115 ± 0.044
Oats        Grain Yield    1/30 acre         Kiesselbach      0.495 ± 0.055
Mangels     Roots          1/200 acre        Mercer & Hall    0.346 ± 0.037
            Leaves                                            0.466 ± 0.043
Potatoes    Tuber Yield    12-foot row       Lyon             0.311 ± 0.043
Corn        Grain Yield    0.03;^ acre       Smith            0.830 ± 0.019


The amount of soil heterogeneity in rod-row trials was measured by Hayes (1925) at
the Minnesota Station in connection with a variety test. Four systematically
distributed plots were used. To obtain the heterogeneity coefficient the average yield of
each strain in the trial was considered as 100. The yielding ability of each plot
was obtained by dividing its actual yield by the average yield of all four replicates
and expressing the result in percentage. By the ordinary method, correlations in
yielding ability of adjacent plots or of plots at any distance apart were determined.
The results were as follows for oats, spring wheat, and winter wheat:

                              Correlation Coefficients (r)
Factors Correlated            Oats             Spring Wheat     Winter Wheat

Adjacent plots                0.572 ± 0.025    0.618 ± 0.023    0.552 ± 0.063
Separated by one plot         0.490 ± 0.029    0.518 ± 0.028    0.293 ± 0.028
Separated by four plots       0.264 ± 0.041    0.449 ± 0.034    0.114 ± 0.118
Separated by ten plots        0.275 ± 0.057    0.429 ± 0.060    ---


The correlation coefficient explains very little unless one knows the factors
involved. However, it affords the best means to consider the amount of replication
that should be practiced.
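
The procedure described for the rod-row trials (yields expressed in per cent of the strain average, then correlated for plots a given distance apart) might be sketched as follows. The function names are hypothetical, the yields are invented for illustration, and the plots are assumed to lie in a single line in field order.

```python
from math import sqrt


def percent_of_mean(yields):
    """Express each replicate yield of one strain in per cent of the strain average."""
    average = sum(yields) / len(yields)
    return [100.0 * y / average for y in yields]


def pearson_r(xs, ys):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def separated_plot_r(relative_yields, plots_between):
    """Correlate each plot with the plot lying `plots_between` plots farther on.

    plots_between = 0 pairs adjacent plots, 1 pairs plots separated by one plot.
    """
    step = plots_between + 1
    return pearson_r(relative_yields[:-step], relative_yields[step:])


# Invented relative yields along one line of plots (per cent of strain average).
field = [112, 108, 104, 97, 95, 99, 103, 106, 101, 94, 90, 93]
for gap in (0, 1, 4):
    print(gap, round(separated_plot_r(field, gap), 2))
```

A coefficient that falls off as the separation increases, as in the table above, indicates that fertility changes gradually across the field.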

Similar results were obtained by Garber, Hoover, and McIlvaine (1926) in West
Virginia experiments. They found a marked correlation between the yields of oat hay in
contiguous plots. The correlation for the yields of replicated plots was sensibly
zero.

(b)  Permanence  of  Differences 

It is important to know whether or not there is a tendency for plots that
produce low yields one season to produce low yields the next season, etc. The
results of Harris and Scofield (1920) indicate a tendency for plots to yield in a
similar manner from year to year, altho there are some exceptions. Their data for
inter-annual correlations for hop yields are as follows:
Series    1st and 2nd    1st and 3rd    1st and 4th    1st and 5th    1st and 6th
          Years          Years          Years          Years          Years

1909
1910
1911
1912

0.580
0.768 ± 0.051     0.662 ± 0.07
0.577 ± 0.032     0.447 ± 0.099     0.451 ± 0.098     0.274
0.315 ± 0.111     0.126 ± 0.121
0.105     0.259 ± 0.115
O.IIJ4
0.061 ± 0.12*;
0.062 ± 0.123
0.511 ± 0.111     0.703 ± 0.062
0.597 ± 0.079

In a later paper, Harris and Scofield (1928) give the results of a 5-year study on
a uniform cropping experiment at Huntley, Montana. In general, a positive
correlation between the yields of a series of plots was found thruout a period of years.
The plots which show a heavier yield one year will in general show heavier yields in
other years during the period under investigation. Under some conditions negative
correlations were found which were interpreted as indicating the importance of a
preceding crop in determining the characteristics of an experimental field.


Garber, et al. (1926) found some tendency for plots which produced relatively high
yields of oat hay in 1923 to produce relatively high yields of wheat grain in 1924.
The correlation coefficient for the two was 0.564 ± 0.056. The study was
continued by Garber and Hoover (1930) to determine whether or not the natural variation
in soil productivity among plots as revealed by a crop uniformity test persisted
after an experiment is started that involves different crops and different soil
treatments. They correlated the relative yields from duplicate oat plots in 1923 and the
relative average yields from the same duplicate plots of other crops in a rotation
experiment from 1924 to 1929 (incl.). The data were as follows:
130
150
1.26
120
150

0.38 ± 0.05
0.35 ± 0.05
0.48 ± 0.05
0.41 ± 0.05
0.42 ± 0.05
0.27 ± 0.03

Average 1923 to 1929 (incl.)
These correlation coefficients were all statistically significant, and ...

... indirectly influence the variation in soil productivity. Steep hillsides are
unadapted to experimentation because heavy rains gully the field and carry the
fertilizers from plot to plot. Moreover, water is apt to pond on certain areas and
influence crop yields. Sometimes errors are introduced by variation in the subsoil.
For example, there are gravel pockets in the subsoil on the Judith Basin (Montana)
field station.

(b) Soil Moisture

The water content of soil was studied by Harris (1915) on the U.S.D.A.
Experimental Farm at San Antonio (Texas). He took borings 6 feet deep at 20-foot
intervals on a field 150 by 264 feet in size. The coefficients ranged from r = +0.32 to
0.70, being statistically significant for each foot section of the upper 6 feet of
soil.

(c)  Fertility  Elements 

The carbon and nitrogen content of soils was studied by Harris (1915) at
Davis, California. The heterogeneity coefficient for carbon was 0.417 ± 0.063, while
that for nitrogen was 0.494 ± 0.057. On blow sands at Oakley, the r-value for carbon
was 0.317 ± 0.068, and that for nitrogen was 0.230 ± 0.072. Wide fluctuations in
nitrate nitrogen were reported by Blaney and Smith (1931) on 1/30-acre plots. They
found that the probable error was usually greater than 5 per cent where less than 20
soil cores were considered. In fact, they recommended 50 soil samples on a
1/30-acre plot to reduce the error to approximately 5 per cent of the mean. When soils
outside of Rhode Island were considered, they found that 6 to 81 samples were
necessary to obtain a probable error that low. Some Colorado Station data show extremely
wide fluctuations in p.p.m. nitrate nitrogen on an irrigated soil. A 13 by 10-foot
plot was sampled in 5 places to a depth of 6 feet. The nitrate nitrogen varied from
5 to 35 p.p.m. on this small area. It is obvious that variations in nitrate nitrogen
can cause yield differences from area to area.

VII.  Corrections  for  Soil  Variability 

Once soil heterogeneity is recognized, some means must be obtained to avoid or
correct its influence in field experiments. A decrease in size of plots and an increase
in the number of replications (as will be shown later) has been the general practice
to overcome soil variation. The repeated plots of varieties or treatments to
be tested against each other are scattered out so that they may sample the different
conditions of the trial area. One variety, for instance, may be grown partly on
favorable portions and partly on less favorable portions. This usually means that
the variety encounters somewhere near average soil conditions. Efficient
experimental designs provide for the removal of a portion of variability due to soil.
Artificially constructed field plots were studied by Garber and Pierre (1933) over a
3-year period. They found that soil heterogeneity was largely removed by a thorough
mixture of soil placed in 30 artificial bins. These soil bins were 9 feet 4 inches
by 4 feet 8 inches (inside area) by 24 inches in height, and were 0.001-acre in area.
They obtained a probable error of a single determination in per cent of the mean of
3.4 for soybean hay, and 6.2 for wheat. They found, however, that the variation in
crops was still too high to make replication unnecessary.

VIII.  Relation  to  the  Experimental  Field 

Many early experimental fields were poorly selected because of the belief that an
experimental farm should contain many different soil types, i.e., the soil should be
extremely heterogeneous. The Ohio Experiment Station was allowed to relocate after
the first 10 years due to the poor choice of the original site (Thorne, 1909). For
all ordinary field experiments the land should be as uniform as possible in regard to
topography, fertility, subsoil, and previous soil management. However, extreme
uniformity may defeat the purpose of the investigator unless such soil is representative
of the area for which the results are to apply.

(a)  Topography 

A  perfectly  level  piece  of  land  is  as  undesirable  for  field  experiments  as 
one  with  surface  inequalities  because  water  may  pond  on  it.  A  slope  of  1  or  2  per 
cent  will  permit  water  from  heavy  rains  to  flow  off  uniformly  and  completely.  A 
slight slope is highly desirable on land to be irrigated. Some experimenters use
low  land  or  "draws"  and  irregular  areas  for  bulk  crops  or  for  seed  increase  plots. 

(b)  Previous  Soil  Treatment 

It is desirable to have soils which have had uniform previous treatment
because there may be a carry-over effect of previous treatments. According to the
American Society of Agronomy standards (1933): "When a field or series of plots has
been occupied by varietal or cultural tests of such a nature as to seriously increase
soil variability, one or more uniform croppings should intervene (or follow) before
it is again used for such tests. It is frequently helpful to arrange the plots at
right angles to the direction of the previous plots."

(c)  Subsoil  Conditions 

When it is necessary to drain lands in the humid regions, the tile lines
should be located so as to influence all plots alike. They should run across the
plots rather than with them. In the case of soil fertility experiments, it is
recommended that a soil profile be taken to a depth of 3 feet for each series of plots.
Before soil treatment experiments are begun, representative samples of the soil and
subsoil should be carefully taken for such analyses as may be desired for future
reference.

Questions for Discussion

1.  Discuss  how  soil  heterogeneity  might  influence  yield  trials. 

2.  Why  and  how  may  small  plots  overcome  the  influence  of  soil  variation? 

3.  What  is  a  uniformity  trial?  How  conducted? 

4. What uses can be made of uniformity trial data?

5. How can correlation be used to measure soil heterogeneity?

6. Fundamentally, what is the so-called "heterogeneity coefficient" used by J.
Arthur Harris? How interpreted?

7. What evidence did Harris have that soil heterogeneity was universal?

8. What general results were obtained at the Minnesota Station when the yields of
adjacent plots were correlated? Those separated by other plots?

9. Are differences in the productivity of plots constant from year to year?
Explain.

10. How may soil topography, moisture, and nitrogen account for soil heterogeneity?

11. What corrections can be used for soil variability?

12. Would artificial soil bins do away with the need for replication? Explain.

13. What precautions should be taken in the selection of an experimental field?

14. Is extremely uniform soil always desirable for experimental work? Explain.

15.  What  is  the  value  of  a  bulk  crop  preceding  an  experiment? 

16.  To  what  use  would  you  put  uneven  and  low  land  in  an  experimental  field?  Why? 


Problems 

1. One acre was planted uniformly to the same variety of wheat and harvested in units
1/500-acre in size. (Data from Mercer and Hall). The 16 plots in the southwest
corner of the acre gave yields in pounds as follows:

(a) Calculate the correlation coefficient by the analysis of variance for a 1 by
2-fold arrangement.

(b)  Calculate  the  simple  correlation  coefficient  for  the  same  paired  values. 

2.  Some  unpublished  data  from  the  Akron  Field  Station  give  the  average  yields  of  corn 
and  oats  (combined)  in  bushels  for  a  particular  piece  of  land  for  20  years  as 
follows: 
Totals   189.8       205.1       196.2       177.1

(a) Calculate the correlation coefficient by the analysis of variance to determine
the heterogeneity from north to south, i.e., for a 1 by 4 combination.

(b) What is the correlation coefficient for a west to east direction? Calculate
r for 1 by 5 combinations.

(c) In what direction is the soil most variable? Why?

(d) Assume that the yield for each plot is 40 bushels in the above problem 2.
Calculate the correlation coefficient by the analysis of variance.

3. Some yields of wheat plots of a single variety grown in 10 by 10-foot plots were
as follows: (Data from Montgomery).

Calculate the heterogeneity coefficient (intra-class correlation) by the analysis
of variance for a 2 by 1-fold combination (2 horizontal rows and 1 vertical row).


CHAPTER  XIII 
SIZE,  SHAPE  AND  NATURE  OF  PLOTS 

I.  Early  Use  of  Field  Plots 

Modern field experiments began in 1834 when Jean Boussingault started a series of
tests on his farm near Bechelbronne in Alsace. Early agricultural investigators
favored large plots because of their attempts to conduct field trials in essentially
the same manner as the farmer handled his crops.

The size of plot was considered at the Virginia Experiment Station as early as 1890
by Alwood and Price (1890) who suggested that, within limits, the larger the plot the
more reliable the results. However, they conceded that small plots were sufficiently
accurate for preliminary trials and for obtaining information on earliness and
general quality of varieties. Taylor (1908) found a wide variation in size of plots used
in this country in 1908. They varied from two acres in a Georgia cotton experiment
to 1/40-acre in size, with all sizes between the two extremes. The average size of
plot in America at that time was 1/10-acre.

The  size  of  plots  in  relation  to  the  experimental  error  was  first  studied  at  the 
Rothamsted  Experimental  Station  in  1910  by  Mercer  and  Hall  (1911).  As  a  result  of 
their  work  and  that  carried  on  subsequently  by  others,  the  trend  has  been  toward 
smaller  plots  and  increased  replication.  A  questionnaire,  sent  out  by  the  Committee 
on  the  Standardization  of  Field  Experiments  of  the  American  Society  of  Agronomy  in 
1913, reflected this tendency. The plot sizes used by different agronomists varied
in size from one acre to 1/200-acre, with very few using plots larger than 1/10-acre
or less than 1/80-acre.

At the present time, plot sizes vary from 1/10 to 1/1000-acre in size. The basis for
the smaller plots with increased replication has been data from various blank or
uniformity trials conducted by Mercer and Hall (1911), Day (1920), Summerby (1925),
McClelland (1926), Wiebe (1935), Smith (1938) and many others. The catalogue by
Cochran (1937) should be consulted for uniformity trials with specific crops.

A  —  Size  and  Shape  of  Plots 

II.  Factors  that  Influence  Plot  Size 

There are several factors to consider in plot size aside from the accuracy of the
results. Some of these are: kind of crop, number of varieties or treatments, kind
of machinery to be used on them, and the amount of land, labor, and funds available
for the tests. (1) Kind of Crop: It is the general practice to use larger plots for
corn, sugar beets, and the forage plants than for small grains. The plots must be
large enough to carry a representative population of the crop involved. (2) Number
of Varieties or Treatments: Small plots are a necessity when large numbers of
varieties or strains are in various testing stages. In small grains, it is not uncommon
to have from 500 to 20,000 strains in the various stages of a breeding program.
(3) Amount of Seed: In the early years of selection in small grains and in many
other plants, only a very small amount of seed is usually available. Obviously, the
plots must not be too large for the seed supply. (4) Kind of Machinery: The area
and shape of field plots should be such as to enable the operation of standard farm
machinery and to reduce to a reasonable minimum the errors concerned therewith.
Larger plots are necessary when the crop is planted, cultivated, and harvested with
standard farm machinery than where hand methods are used. (5) Land Area: For a
given area of land, the plot size varies inversely with the number of varieties or
treatments to be included. This is true until the minimum practical size is reached.
As a result, to quote Goulden (1929): "The general practice is to use quite small
plots adequately replicated for strain tests, i.e., when there are a large number of
varietal units, and larger plots when the number of varieties is small enough to
permit their use with the amount of land available." (6) Funds Available: In general,
it is more costly to use large plots than small plots.

III.  Kinds  of  Experimental  Plots 

It is necessary to distinguish between nursery and field plots more or less
arbitrarily. Nursery plots are usually small plots cared for by hand while field plots are
larger and adapted to the use of standard farm machinery. The present tendency is to
reduce the size of field plots and to enlarge nursery plots from single to multiple
short rows (rod-rows in many cases).

(a)  Nursery  Plots 

Nursery plots may be as small as one square yard in area, but the rod-row is
probably the most common unit size. Small plots allow the preliminary testing of
many strains. However, uniform soil and careful technic are vital to accuracy for
small plots. Taylor (1908) points out that small mistakes on small plots may greatly
modify the results. For example, an error of 5 pounds on a 1/20-acre plot would mean
an error of 100 pounds on an acre basis. The rod-row unit has been widely used in
this country for small grain trials while the chessboard plot has been used in
England. Engledow and Yule (1926) describe the latter as being one yard square with the
crop space-planted at 2 by 6 inches. The principal objection to the chessboard is
the amount of detailed hand labor involved and the fact that it affords less
opportunity to observe strength of straw, evenness of germination, etc. As plant
individuality must be considered in row crops, there is some variation in type of nursery
plots.
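
Taylor's example of a weighing error carried to an acre basis is simply the plot error divided by the plot area in acres:

$$ \frac{5\ \text{lb.}}{1/20\ \text{acre}} = 5 \times 20 = 100\ \text{lb. per acre} $$

The same error on a smaller plot is multiplied by a correspondingly larger factor.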

(b)  Field  Plots 

For standard farm machinery, field plots usually vary from 1/10 to 1/100 acre
in size. They offer more opportunity to observe crop behavior under conditions
comparable to those found on the farm. Field plots are used for variety tests, crop
rotation experiments, fertilizer trials, forage experiments, pasture experiments,
irrigation studies, cultural trials, etc. Ordinarily, such plots are long and narrow
in shape as most convenient for farm machinery.

(c) Comparison of Nursery vs. Field Plots

The use of small hand-sown nursery plots to test yields of agricultural crop
varieties has been frequently criticised on the ground that such plots do not
represent normal agricultural conditions. In general, small plots have been found to
compare favorably with large field plots in accuracy so long as adequate precautions
have been taken against competition and other errors. There is further evidence that
nursery plots give results that are valid when applied to agricultural practice.

As early as 1910, Lyon (1910) reported a comparison of seven 1/10-acre plots
with seven groups of 10-row plots 17 feet long. The probable errors were p.09 and
k.k'-); respectively. Moreover, less land was required for the small plots. Seven
1/10-acre plots covered an area of 30,492 square feet, while 70 of the 17-foot rows
required only 1,190 square feet in area.
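
The two areas quoted for Lyon's comparison can be checked by simple arithmetic. The acre is taken as 43,560 square feet. The 1-foot spacing between rows is an assumption, inferred here from the figures themselves and not stated in the text:

$$ 7 \times \frac{43{,}560}{10} = 30{,}492\ \text{sq. ft.}, \qquad 70 \times 17 \times 1 = 1{,}190\ \text{sq. ft.} $$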

A general correspondence of rod-rows and field plots has been shown by Klages
(1933) for 11 to 14 varieties of spring wheat, 7 varieties of durum wheat, 12 to 15
varieties of oats, 13 to 20 varieties of barley, and 7 varieties of flax in each of
4 years. He calculated the correlation coefficients (r) for the two sets of plots.
Hayes and others (1932) compared the yields of 16 wheat varieties sown in rod rows by
hand and by a drill at different rates with those sown by a farm drill in 1/40-acre
plots. The correlation coefficients indicate some agreement between the yields
obtained from the small and large plots. Smith (1936) has criticized the correlation
coefficient as inefficient in such comparisons: "If real differences between
varieties were either small or non-existent, then the correlation coefficients would be
zero or insignificant, altho the trials might agree in showing no significant
differences between them. On the other hand, the correlation coefficient could not become
unity unless experimental error could be entirely eliminated. Consequently, r may
vary from 0 to +1 even while the two forms of trial are in perfect agreement."

In a study of 12 timothy varieties, Smith and Myers (1934) showed that the
yields from rod-rows and 1/50-acre field plots agreed to precisely the degree required
by statistical theory. Smith (1934) later compared 9 wheat varieties sown by a farm
drill in 1/100-acre plots and dibbled in square yard plots. Agreement of the two
experiments was excellent with respect to yield of grain. Tysdal and Kiesselbach (1939)
compared 2 varieties of alfalfa in 1/30-acre field plots with various 16-foot nursery
plots which differed as to number of rows and spacing. They combined the forage
yields into a single analysis of variance from which they concluded that the several
types of nursery plots gave essentially the same yields of the two varieties as did
the field plots. The interaction of varieties x type of plot was not significant.

The problem resolves itself into whether small nursery plots with more precise
control of soil heterogeneity will give the same results as large field plots
with less control of soil variability. The sacrifice in plot (sample) size must be
balanced by more effective control of soil heterogeneity for the small nursery plot
to be as satisfactory as the large field plot. This can be brought about to some
extent by increased replication of small plots.

IV.  Relation  of  Plot  Size  to  Accuracy 

In general, it has been found that the variability is decreased as the plot is
increased in size up to about 1/10-acre. However, the variability is less when a unit
of a certain area is made up of several distributed units than when a single large
unit is used. In a theoretical discussion, Siao (1935) states "Increasing the size
of plot decreases the variability of the experiment by increasing the precision of a
single plot yield. On the other hand, there is an increase in the variability within
the block through expanding the area included in the block. There are two opposing
tendencies that affect the experimental error as the plot changes in size, the final
result being due to a balance between these two tendencies. The slow rate of
reduction in experimental error through increase in size of plot and, in exceptional cases,
the greater variability for larger plots, may be explained by increase in variation
within the block as the plot increases in size." The work of Stadler (1921) and
Wiebe (1935) indicates that the total variation tends to increase as more land is
added to the experimental area, provided the size and shape of the ultimate units
remain the same. It should be emphasized that the best plot size varies with the conditions
of the experiment, there being no one size best for all crops on all soils.
Comparative studies on plot size have been carried out in most instances on blank or
uniformity tests. After optimum plot size has been determined, the standard error per plot
and the number of replications to reach a given degree of accuracy in the comparison
of the mean treatment yields are usually computed. Typical investigations on plot
size will be considered for small grains and for other crops separately.

(a)  Small  Grain  Plots 

Much of the earlier work was conducted with small grains. The conclusions
applicable to one are generally applicable to the others. Mercer and Hall (1911)
used uniformity trial data for an acre of wheat, the field being divided into 500
small plots each of which was harvested separately. Adjacent plots were grouped so
as to form plots of different sizes. The standard deviations in per cent for the
1/500, 1/250, 1/125, 1/50, 1/25, and 1/10-acre plots were 11.6, 10.0, 8.0, 6.5, 5.7,
and 5.1, respectively. The standard deviation was reduced as the plots were made
larger, but the increase in plot size above 1/50-acre produced a relatively small
decrease in variability. These investigators found that precision was increased more
rapidly by replication. When five scattered 1/500-acre plots were combined so as to
give a total area of 1/100-acre the standard deviation in per cent of the mean was
reduced to 4.8 per cent. Olmstead (1914) found with wheat that a number of small
plots ranging down to 0.0007-acre in size is much better than the same total area in
one plot, and also that one large plot is more accurate than one small one. In wheat
studies, Day (1920) found that the probable error decreased with an increase in plot
with wheat, Smith (1938) concluded that the
Hayes (1923) compared 16 and 32-foot rows of wheat, oats, and barley. He failed to
find a significant difference in favor of 32-foot rows. A comparison of one and two
rod-row plots indicated little advantage for the harvest of two rod-rows per plot
over one. Stadler (1921) obtained data on three and five-row plots, the border rows
being discarded. His results follow:

Summerby (1925) found very little difference in accuracy between large and small
plots when eight replications were used. His oat plots were 1, 2, 4, 8, 16, and 32 rows
in width, spaced one foot, and 15 feet long. Love and Craig (1938) made an analysis
of data from 2 oat crops for various types of plots and various numbers of
replications, and for rows 15 and 30 feet in length. The data indicate that 4-row plots
with several replications (8 or 10), when all rows are harvested, give accurate
results. They are preferred to single-row plots. The 15-foot rows were considered
more satisfactory than those 30 feet in length. Such data as these support the
widespread practice of using three rod-row plots for small grain nursery trials with the
center row harvested for yield.
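
The Mercer and Hall figures quoted above can be set against what would be expected if a large plot behaved like the same number of independent 1/500-acre units. The following is only a sketch: the standard deviations are copied as printed in this section, the square-root rule is the usual one for independent units, and the variable names are illustrative.

```python
import math

# Plot size in multiples of the 1/500-acre unit, mapped to the standard
# deviation (per cent of the mean) as printed above for Mercer and Hall.
observed_sd = {1: 11.6, 2: 10.0, 4: 8.0, 10: 6.5, 20: 5.7, 50: 5.1}


def expected_sd(sd_unit: float, n_units: int) -> float:
    """SD expected if n_units independent unit plots were averaged."""
    return sd_unit / math.sqrt(n_units)


comparison = {
    n: (sd, round(expected_sd(observed_sd[1], n), 2))
    for n, sd in observed_sd.items()
}

for n, (obs, exp) in comparison.items():
    print(f"{n:>2} units  observed {obs:5.1f}  expected if independent {exp:5.2f}")
```

In every case the observed value is at or above the independent-unit expectation, which is one way to see why adjacent land added to a plot gains less than the same area scattered as replicates.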

(b)  Other  Crops 

Different crop plants are known to differ in variability. The coefficients
of variability for different crops were compared by Smith (1933) for a standard
1/40-acre plot from the published data for 39 uniformity trials. The crops fell
roughly into 3 groups: (1) wheat, mangolds, sugar beets, soybeans, and sorghums
(forage) seemed to be the least variable; (2) corn, potatoes, cotton, and natural pasture
were intermediate; and (3) fruit trees were most variable.

In the case of corn, Bryan (1933) reports that "variability of plot yields
decreased as the size of plots increased from 3 to 16, to 24, and to 48 hills, but
the decrease was not proportional to the size of plot. The experimental error for a
given area, therefore, would be lower with larger numbers of small plots."
McClelland (1926) obtained a similar reduction in error as the size of plot was increased,
the error being 11.2 per cent for 1/80-acre plots, and 6.2 per cent for those 1/2
acre in size.


With sorghums, Stephens and Vinall (1928) concluded that the errors decrease
with an increase in plot size up to 1/20-acre. Increasing the plot from l/oOO-acre
to 1/20-acre, with the same total area concerned, reduced the probable error about
60 per cent.

The standard error was found by Immer (1933) to be actually reduced in sugar
beet plots when the plot size was increased from one to two rows in width, or for an
increase in length from two to four rods. However, efficiency in the use of land
decreased as the size of plot was increased. Some of his data for the harvest of the
entire plot are as follows:

Length of Plot     Percentage Efficiency of Plots of Indicated Width (Rows)
in Rods               1        2        3        4        6       12

    2               100.0     88.0     77.7     53.3     34.9     21.4
    4                76.2     62.5     48.2     35.2     21.2     28.8
   10                50.0     37.6     28.6     26.1     10.2      9.4
   20                35.1     24.5     21.6     10.1      5.8      6.7

Similar results were reported by Immer and Raleigh (1933).

Uniformity  trial  data  with  soybeans,  computed  by  Odland  and  Garber  (1928), 
indicate  that  16-foot  plots  in  single  rows  replicated  three  times  were  the  most  satis- 
factory when  both  accuracy  in  results  and  land  economy  were  taken  into  account. 

Westover (1924) experimented with 220 rows of potatoes, 150 feet long. He
harvested them in 10-foot lengths, and found a sharp reduction in probable error
between row lengths of 10 and 40 feet. Beyond 60-foot lengths, there was very little
reduction in error.

Ligon (1930) found no necessity for rows greater than 100 feet in length for
cotton, the shorter rows being just as accurate when sufficiently replicated. Unit
rows of cotton 24 feet long and spaced one foot apart were used by Siao (1935) in
studies on size of plot for cotton. When combined into plot sizes of 1, 2, 3, 4, and 3
rows, the efficiency was greatest for the smallest plot.

In  plot  size  studies  with  millet,  Li  and  others  (193&)  concluded  that  plots 
15 feet long and two rows wide were the most efficient, i.e., 113.9 per cent compared
to  100  per  cent  for  15-foot  plots  one  row  wide. 

Batchelor and Reed (1918) studied the variability of orchard plot yields from
the standpoint of increasing the number of adjacent trees per plot. The average
reduction in variability for all fruits was from 37.78 to 24.27 per cent when the plot was
increased from one to eight trees, but little was gained by including 16 to 24 trees
per plot.

The reasons for variability in small plots may be summarized as follows:
(1) variability in soil, (2) losses in harvest and errors in measurement have a
relatively great effect, (3) in row crops, plant variability may be important because of
fewer plants, (4) competition and border effects are apt to be greater on small plots.

V.  Plot  Sizes  for  Various  Crops 

The plot sizes depend upon the crop plant, and upon the conditions under which the
test is conducted.

(1) Small Grains: The majority of experiment stations use three-row plots with the
center row harvested for yield, but a few use five-row plots with the center three
rows used for the yield determination. A few use single-rod-row plots.

(2) Corn: The Nebraska station uses four-row plots, 12 hills long, harvesting
the center rows for yield. Others use single rows about 20 hills long, or three rows
of the same length with only the center one harvested for yield. Bryan (1933)
reports that, in a comparison of open-pollinated varieties and hybrids, equal degrees
of precision were attained with about half as many plants or hills of crosses as of
open-pollinated varieties. He found that 43 total hills were sufficient to represent
a variety.

(3) Soybeans: Soybeans may be grown in rows 16 feet long and 30 to 32
inches apart. Field plots are often employed.

(4) Sorghums: The work of Stephens and Vinall (1928) indicates that "three or four
replications of 1/40-acre or 1/80-acre plots will give results sufficiently reliable
for the ordinary sorghum test." Slightly larger plots are advocated by Swanson (1930).
When protected by borders, 2 and 4-row plots 8 rods long, having an area from 1/50 to
1/25-acre, are regarded as convenient units. At Kansas, four-row plots about 100 feet
long are used. The grain sorghums are thinned to eight inches in the row, while the
forage sorghums are spaced four inches in the row. The rows are spaced the same
distances apart as for corn.

(5) Alfalfa and Clovers: These crops are usually grown in field plots about seven
feet wide and 60 feet or longer in length, with the center five feet harvested with
a mower. Tysdal and Kiesselbach (1939) state that the most serviceable types of plot
for advanced nursery testing appear somewhat optional among these: (a) solid-drilled
5 to 8 rows spaced 7 inches apart with a 12 to 14-inch alley between border rows; or
(b) solid-drilled 3 to 5 rows spaced 12 inches apart with an 18-inch alley between
border rows. The entire plot may be harvested since very little error due to border
effect occurs. (c) Single rows spaced 18 to 24 inches apart are permissible for
preliminary nursery tests.

(6) Sugar Beets: Immer (1932) states that four-row plots are the most efficient.
The rows should be two to four rods long, spaced 20 to 22 inches apart, and the
plants thinned to about 12 inches in the row.

VI.  Relation  of  Shape  to  Reliability


Some investigators have found that long narrow plots best overcome the effects of
soil heterogeneity, while others believe that plots should be approximately square.
For example, Barber (1914) reported that a small square plot affords a more accurate
basis for variety comparisons than a long narrow plot that has extra growth along the
borders when alleys exist between the plots. On the other hand, Kiesselbach (1918)
showed that the coefficient of variability for 1/10-acre oat plots 48 rods by 5.5
feet was 3.84 per cent, as compared with 5.18 per cent for plots 16 rods by 16.5 feet.
Justesen (1932) found long narrow plots to be more efficient than the shorter plots
of the same area. Mercer and Hall (1911) divided the plots of a single variety into
plots of equal area but of different shapes. The dimensions were 20 by 12, and 30 by
5 yards. They found no significant difference in variability between them. Similar
results were obtained by Stephens and Vinall (1928) with sorghums. Bryan (1930), in
his work with various shapes of corn plots, concluded that shape is less important
as the size of plot is reduced. With plots as small as 16 hills, either single, two
or four-row plots may be expected to give similar results.


These apparent inconsistencies are explained in the work of Day (1920) who harvested,
in five-foot sections, a 1/40-acre area uniformly cropped to wheat and combined the
ultimate units to form plots of various shapes. He found that plots with their
greatest dimension in the direction of the least soil variation are more variable
than plots having their greatest dimension in the direction of the greatest variation.
He found that shape exerted no influence on accuracy where soil variation is as great
in one direction as it is in the other. Some of his data follow:

Similar conclusions were obtained by Siao (1935) for cotton and by Smith (1938) for
wheat.

It is generally conceded that relatively long and narrow plots with the long
dimension in the direction of the greatest soil variation best overcome the effects of
soil heterogeneity. In addition, linear plots are more economical for cultural
operations. However, the area occupied by a single replicate or block should approach
a square in shape for the most efficient design.

VII.  Practical  Considerations  in  Plot  Shape 

Width of plots should be sufficient to allow for the removal of border rows when
this appears desirable, or to render border effects negligible when not removed. The
triple rod-row is a convenient shape for small plots, while large plots are usually
rectangular in shape to accommodate farm machinery in an attempt to simulate farm
conditions.

(a)  Adaptation  to  Farm  Machinery 

Some multiple of seven feet provides a favorable width for field experiments
as it permits convenient operation of the 3.5 and 7.0-foot farm implements.
Kiesselbach (1928) calls attention to the fact that the multiple of seven feet will enable
the use of the seven-foot disk, seven-foot drill, seven-foot binder, and 3.5-foot
corn planter and cultivator. The standards of the American Society of Agronomy (1933)
recommend 14 feet as a minimum plot width for crop rotation, fertilizer, and tillage
experiments, while varietal tests with inter-tilled crops commonly should contain at
least three or four rows. Extremely narrow plots, in the case of manurial or
fertilizer tests, make it difficult to keep the treatments within the plot limits.

(b)  Calculation  of  Plot  Size 

Plots should be made an aliquot part of an acre, e.g., 1/40, 1/50, or 1/80-acre
plots. This sort of plan is worthwhile because of the grave possibility of error in
computations made on acres expressed as decimal fractions. For instance, to
calculate the dimensions of a 1/40-acre plot for a drill seven feet wide, the steps are as
follows:

     43,560/40  =  1,089 square feet in 1/40 acre.
     1,089/7    =  155.6 feet for length of the plot.

Hayes (1923) suggests, for small grain nursery rows spaced 12 inches apart, that the
row length be adjusted slightly so that gram yields per plot can be
converted to bushels per acre by multiplying by a simple conversion factor. The
factor 0.2 can be used for a 15-foot row of oats, the factor 0.1 for a 16-foot row
of wheat, and the factor 0.1 for a 20-foot row of barley.
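
The two calculations above can be checked with a short sketch. The bushel weights used here (32 pounds for oats, 60 for wheat, 48 for barley) and the figure of 453.592 grams per pound are assumptions supplied for the illustration; they are not given in the text, and the function names are hypothetical.

```python
SQ_FT_PER_ACRE = 43_560
GRAMS_PER_POUND = 453.592  # assumed conversion, not from the text


def plot_length_ft(acre_denominator: float, width_ft: float) -> float:
    """Length of a 1/acre_denominator-acre plot of the given width."""
    return SQ_FT_PER_ACRE / acre_denominator / width_ft


def grams_to_bushels_per_acre(row_length_ft: float, row_spacing_ft: float,
                              lb_per_bushel: float) -> float:
    """Factor that converts grams per row to bushels per acre."""
    rows_per_acre = SQ_FT_PER_ACRE / (row_length_ft * row_spacing_ft)
    return rows_per_acre / (lb_per_bushel * GRAMS_PER_POUND)


print(round(plot_length_ft(40, 7), 1))                      # 1/40 acre, 7-foot drill
print(round(grams_to_bushels_per_acre(15, 1.0, 32), 2))     # oats, assumed 32 lb/bu
print(round(grams_to_bushels_per_acre(16, 1.0, 60), 2))     # wheat, assumed 60 lb/bu
print(round(grams_to_bushels_per_acre(20, 1.0, 48), 2))     # barley, assumed 48 lb/bu
```

Under these assumed bushel weights the factors agree with the 0.2, 0.1, and 0.1 quoted from Hayes to two decimal places.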

VIII.  Calculation  of  Plot  Efficiency 

Some uniformity data on 120 rod rows of Haynes Bluestem wheat in bushels per acre, as
given by Hayes and Garber (1927), will be used for the computation of plot efficiency.
The method was suggested by Dr. F. R. Immer.

It is assumed that 30 varieties are to be tested, and that the investigator desires
to determine the relative efficiency of 1, 2, and 4-row plots. The analysis of
variance will be used.

(1)  One Row per Plot:

Block            S(x_b)           S(x_b^2)

I                 727.2          528,819.84
II                767.6          589,209.76
III               826.7          683,432.89
IV                760.8          578,816.64

Totals          3,082.3        2,380,279.13

S(x) for all plots  =  3,082.30     x̄  =  25.685833

S(x^2) for all plots  =  80,176.37     S(x)^2/N  =  79,171.4306

S(x^2) - S(x)^2/N  =  1,004.9394

S(x_b^2)/30 - S(x)^2/N  =  79,342.64 - 79,171.43  =  171.21

Variation                            Sum            Mean
due to                   D. F.       Squares        Square

Blocks                      3         171.21        57.0690
Varieties and Error       116         833.73         7.1873

Total                     119       1,004.94
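
The one-row computation can be retraced from the block totals and the total sum of squares printed above. This is a sketch only: the function name and dictionary keys are illustrative, and because the correction term is carried here at full precision the block sum of squares may differ from the printed figure in the second decimal.

```python
def block_anova(block_totals, plots_per_block, sum_of_squares):
    """Split the total sum of squares into blocks and remainder."""
    n = len(block_totals) * plots_per_block
    correction = sum(block_totals) ** 2 / n          # S(x)^2 / N
    ss_total = sum_of_squares - correction
    ss_blocks = sum(t ** 2 for t in block_totals) / plots_per_block - correction
    df_blocks = len(block_totals) - 1
    df_error = (n - 1) - df_blocks
    ss_error = ss_total - ss_blocks
    return {
        "ss_total": ss_total,
        "ss_blocks": ss_blocks,
        "ss_error": ss_error,
        "df_error": df_error,
        "ms_error": ss_error / df_error,
    }


# Block totals and S(x^2) as printed for the one-row plots.
one_row = block_anova([727.2, 767.6, 826.7, 760.8], 30, 80_176.37)
print(one_row)
```

The mean square for "Varieties and Error" returned under `ms_error` is the quantity carried forward to the comparison of plot sizes.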

(2)  Two Rows per Plot:

Block            S(x_b)           S(x_b^2)

I               1,494.8        2,234,427.04
II              1,587.5        2,520,156.25

Totals          3,082.3        4,754,583.29

S(x^2)/2 - S(x)^2/N  =  159,661.25/2 - 79,171.43  =  659.19

The total S(x^2) is divided by 2 to place the results on a single plot basis so
that the common correction factor, S(x)^2/N, can be used.

S(x_b^2)/60 - S(x)^2/N  =  79,243.05 - 79,171.43  =  71.62

Variation                            Sum            Mean
due to                   D. F.       Squares        Square

Blocks                      1          71.62         71.62
Varieties and Error        58         587.58         10.13

(3)  Four Rows per Plot:

S(x^2)/4 - S(x)^2/N  =  318,930.59/4 - 79,171.43  =  561.21

Variation                            Sum            Mean
due to                   D. F.       Squares        Square

Total                      29         561.21        19.3521
(4)  Comparison of 1, 2, and 4-Row Plots

No.               Rows         No.        Mean        Mean Square        Pct.
Determinations    per Plot     Blocks     Square      Basis One Plot     Efficiency

30                1            4           7.1873      7.1873            100.00
30                2            2          10.1307      5.0655             70.95
30                4            1          19.3521      4.8380             37.14
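
The last two columns of the comparison can be reproduced from the mean squares. The reading assumed here, which the text does not spell out, is that the mean square is first divided by the number of rows to put it on a one-row basis, and that efficiency then charges each plot for the land it occupies. The function names are illustrative.

```python
def ms_one_row_basis(mean_square: float, rows_per_plot: int) -> float:
    """Mean square of a multi-row plot expressed on a single-row basis."""
    return mean_square / rows_per_plot


def percent_efficiency(ms_single_row: float, mean_square: float,
                       rows_per_plot: int) -> float:
    """Efficiency relative to one-row plots, per unit of land (assumed reading)."""
    per_row = ms_one_row_basis(mean_square, rows_per_plot)
    return 100 * ms_single_row / (per_row * rows_per_plot)


mean_squares = {1: 7.1873, 2: 10.1307, 4: 19.3521}  # as printed above
for rows, ms in mean_squares.items():
    print(rows,
          round(ms_one_row_basis(ms, rows), 4),
          round(percent_efficiency(mean_squares[1], ms, rows), 2))
```

On this reading the two-row and four-row plots lower the variance per row but not by enough to repay the extra land, which is what the efficiency column expresses.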

B -- Plot Replication in Experimental Work

IX.  Replication

Replication is merely repetition. The investigator repeats a variety or treatment
several times in a test in order to obtain a mean yield or value which is a more
reliable estimate of the yield of the general population than that obtained from a
single plot of a treatment. It also provides the mechanism for a valid estimate of
the random errors in an experiment. Strictly speaking, five replications of a
variety refer to six plots, i.e., the original plot and its repetition five times.
For the sake of simplicity, the number of replications will be understood to mean the
number of plots grown of each variety or treatment. In field experiments, a single
replicate is usually planned to contain one plot of each treatment in a rather
compact block. The repetition of the treatments is brought about by the repetition of
the blocks. This distribution of plots over the experimental area is an effort to
"sample" the field in an attempt to measure and, in some cases, remove the influence
of soil heterogeneity. Replication in space and time is often necessary. For
example, it may be desirable to repeat an experiment in other regions of the state in
order to sample different soil and climatic conditions. In the same region,
repetition of the experiment over a number of years may be necessary to sample the climatic
conditions in different seasons.

X.  History  of  Replication 

Replication  of  experimental  plots  has  been  comparatively  recent.  In  the  old  field 
tests  large  single  plots  were  placed  side  by  side.  These  were  simple  and  effective 
for  the  demonstration  of  known  facts  so  long  as  the  differences  to  be  observed  were 
large.  However,  they  are  inadequate  as  soon  as  accurate  measurements  are  needed  be- 
cause they  do  not  take  into  account  the  tremendous  variation  in  the  soil  from  plot  to 
plot. 

Sir John Russell (1931) gives some of the early history of replication. The
Broadbalk plots at the Rothamsted Experimental Station were split lengthwise into two
halves in 1846-47 which, from that time onwards, were harvested separately. This was
the first duplication of field experiments so far as can be determined. In 1847-48,
and occasionally afterwards, one half of each plot was treated differently from the
other with the result that they ceased to be strict duplicates. Better duplication
appears to have been practiced by P. Nielsen, founder of the Danish Experiment
Station about 1870, in his experiments on grass mixtures for pastures. Some Norfolk
(England) experiments carried out in the later 1880's were systematically replicated
as follows: ABCDDCBA. In America, some experimenters began to use replication about
1888. Some old experiments in Kansas were replicated six times. However,
replication soon fell into disuse because of the demand for information and because of limited
land and funds. Single plots were the rule. Nothing further was done in England
until 1909 when A. D. Hall (1909) and later Wood (1911) urged the need for the
estimation of experimental errors. Marked changes came about as a result. S. C. Salmon
(1913) revived duplication of plots in this country in 1910. Single 1/10-acre plots
were commonly used for variety and rate and date of seeding tests with small grains
at that time. He split those 1/10-acre plots into 1/50-acre plots and replicated
them five times for variety tests. His rate and date tests were replicated three
times. Thus, the same area was required for variety tests as before and a smaller
area for rate and date tests. Largely as the result of his efforts, the Office of
Cereal Investigations, U. S. D. A., provided for replication in their work about
1912. In England at about the same time, Dr. E. S. Beaven designed his well-known
strip method of replication which is especially suited to variety trials.

A questionnaire sent out by the Committee on Standardization of Field Experiments of
the American Society of Agronomy in 1918 indicated that less than 20 per cent of the
agronomic workers depended upon single plot tests even though they had been the rule
10 years previously. At present, replication is considered essential in modern field
experiments.

XI.  Reduction  of  Error  by  Replication

The most effective method to obtain greater accuracy in field experiments, as well as
in many other types of agronomic experiments, is to increase the number of
replications. It can be brought about to a limited extent by an increase in plot size as
shown by Summerby (1923). However, frequent replication of small plots proved to be
a more efficient means to obtain a high degree of accuracy than the use of the same
amount of land with less frequently replicated larger plots. Love (1936) gives some
uniformity trial data with cotton that indicate the same trend. The probable error
for a 2-row plot 20 feet long was 10.35 per cent, while that for two single 20-foot
plots was 9.01. Further, the probable error for a single 4-row plot was 9.51 while
for the same area made up of four scattered units it was 7.55 per cent. Many other
investigators have obtained similar results. Since the standard error of the mean
($\sigma_{\bar{x}}$) is given by $\sigma_{\bar{x}} = s/\sqrt{n}$, it follows that
$\sigma_{\bar{x}}$ decreases in inverse proportion to the square root of the number
of replications. This rule applies when the variation due to the replicates
themselves is removed from the error, but not strictly otherwise. This can be
illustrated with some data on 120 rod rows of bluestem wheat,
cited by Hayes and Garber (1927). The value of replication was studied on the
variability of yields calculated separately on the basis of 20 determinations and for
1, 2, 4, and 6 systematically distributed plots. The coefficients of variability were
compared with mathematical expectation as follows:

No.                 No. systematically                      Mathematical
determinations      distributed plots       C. V.           expectation

20                  1                       9.05            9.05
20                  2                       8.34            9.05/√2 = 6.42
20                  4                       5.61            4.53
20                  6                       4.44            3.69
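
The "mathematical expectation" column follows from dividing the single-plot coefficient of variability by the square root of the number of plots. A minimal sketch, with the observed values taken as read from the table above and with illustrative names:

```python
import math


def expected_cv(cv_single_plot: float, n_plots: int) -> float:
    """C.V. of a mean of n_plots plots if their errors were independent."""
    return cv_single_plot / math.sqrt(n_plots)


# Observed C.V. for 1, 2, 4, and 6 systematically distributed plots.
observed_cv = {1: 9.05, 2: 8.34, 4: 5.61, 6: 4.44}

for n, cv in observed_cv.items():
    print(f"{n} plots: observed {cv:.2f}, expected {expected_cv(9.05, n):.2f}")
```

Each observed value sits above its expectation, which is the shortfall the text goes on to explain.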

The calculated coefficient of variability decreases as a result of replication, but
less rapidly than would be indicated by mathematical expectation. This is attributed
to the greater land area used for several replications than can be used for
single-plot trials which, on the average, brings in soils of greater difference in
productivity than can be found in smaller areas. In this case, the error due to blocks has
not been removed. That replication beyond a certain point may be impractical is
indicated in some data compiled by Salmon (1923). He shows the relation between the
number of replications and the probable error of the mean (expressed in per cent) as
follows:

Number of Plots      1      2      3      4      5      6      7      8      9

Kherson oats        3.7    2.0    2.0    1.7    1.0    1.8    1.7    1.6    1.5
Alfalfa            11.2    7.3    7.1    5.0    k.Q    M      5.5    6.0    5.8
Ear corn            9.0    5.9    5.5    5.1    4.1    3.7    4.1    4.1    5.3
It is to be noted that variability was rapidly reduced up to 4 replications, but the
decrease was at a much slower rate beyond that point. Hayes and Garber (1927)
questioned whether the gain in accuracy beyond three replications warranted the
additional work. The relation of replication to design will be considered in a later
chapter.

XII.  Number  of  Replications 

The question naturally arises as to the number of replications that should be used.
Goulden (1929) states that it depends upon the degree of soil heterogeneity, the
degree of precision required, and the amount of seed available. Any desired degree
of precision within practical limits may ordinarily be achieved for any given set of
conditions by replication. For field plots, the American Society of Agronomy (1933)
recommends 3 to 6 replications, dependent upon the degree of precision required. The
smaller number will suffice when average rather than annual results are stressed.
From 4 to 6 replications are commonly used in corn variety trials. Nursery
experiments ordinarily should be replicated 5 to 10 times to assure significant results.
It is impossible to prescribe a rule for all cases. In rod-row trials with oats,
Love and Craig (1938) found 8 or 10 replications more satisfactory than a smaller
number, such as 3 or 5. In alfalfa nursery plots, Tysdal and Kiesselbach (1939) concluded
that 4 to 16 replications were necessary to make a 5 per cent difference
statistically significant for plots that varied in size from 1/80-acre to a single space-planted
16-foot row. The larger plot required the 4 replications. However, little is to be
gained by the use of more than 10 replications in field trials.
Questions  for  Discussion

1.  What  was  the  early  history  of  the  use  of  field  plots? 

2.  What  type  of  experiment  is  used  to  compare  different  sizes  and  shapes  of  plots? 
Why? 

3.  What  practical  considerations  usually  determine  the  size  of  plots  used  in  field 
experiments? 

4. What is the general objective of nursery tests and what general relation should
they have to field plots?

5.  Distinguish  between  nursery  and  field  plots. 

6.  What  are  the  common  sizes  of  nursery  plots?  Field  plots? 

7.  Compare  nursery  plots  and  field  plots  as  to  accuracy. 

8.  What  is  the  relation  between  size  of  plots  and  the  standard  error?  Between  size 
of  plot  and  border  effect? 

9.  How  may  increased  size  of  plot  increase  the  amount  of  variability? 

10.  What  size  of  plot  has  shown  the  lowest  variability  for  practical  purposes  with 
wheat?  Corn?  Soybeans?  Millet?  Cotton? 

11.  What  is  meant  by  efficiency  in  plot  size? 

12.  What  reasons  can  be  given  for  the  variability  in  results  with  small  plots? 

13.  What  is  a  common  size  of  plots  for  small  grain  nurseries?  Corn  trials?  Sorghums? 
Alfalfa?  Sugar  beets? 

14. In general, what relation is found between shape of plots and the standard error?

15.  What  relation,  if  any,  is  found  between  the  direction  in  which  plots  extend  and 
the  standard  error?  Why? 

16.  What  recommendations  would  you  make  on  width  of  field  plots  for  the  use  of  farm 
machinery?  Why? 

17.  What  relation  is  found  between  shape  of  plots  and  border  effect? 

18.  What  modifications  can  be  made  in  length  of  nursery  rows  for  wheat,  oats,  and 
barley,  for  rapid  conversion  of  yields  to  bushels  per  acre? 

19.  What  is  replication?  Why  used? 

20. What has been the general practice regarding replication of plots? What is the
practice now?

21.  Trace  the  early  history  of  plot  replication. 

22.  What  serious  results  may  result  from  single  (unreplicated)  plot  trials?  Why? 

23.  What  is  the  theoretical  relation  between  the  number  of  plots  and  the  standard 
error?  Actual  relation?  Why  do  they  not  always  agree? 

24. What class of errors does plot replication tend to reduce or eliminate? On what
class does it have no effect?

25. Discuss the statement: "Precision can be increased indefinitely by replication."

26. How does replication furnish an estimate of error?

27.  Give  a  general  rule  or  rules  for  plot  replication. 


Problems 

1. It is desired to use 1/30-acre plots in a crop rotation experiment and to make
them 14 feet wide. Calculate the plot length.

2. Some data reported by Wiebe (1935) are given below for 15-foot rows of wheat, one
foot apart. The yields are reported in grams. Assume 15 varieties, and compute
the efficiency for 1-, 2-, and 4-row plots.

Series 1    Series 2    Series 3    Series 4

715         595         380         580
770         710         655         675
760         715         690         690
663         613         685         353
753         730         670         380
745         670         585         560
6hi         690         530         520
785         493         455         ['70
360         540         450         300
685         730         610         500
755         810         665         570
640         635         585         463
725         C53         530         ■lr55
715         773         615         545
700         705         355         Mo

3. Calculate the number of replications required to make a 5 per cent difference in
yield statistically significant for these sizes of plots:

Kind of Plot                                          σ_s (in per cent)

Field plot                                                  3.3
Single-row plot (18-inch spacing)                           3.2
Single-row plot (24-inch spacing), hand-planted             7.0

Use the formula (2 σ_s √2) / √n = 5 (per cent difference in yield), where n = the
number of replications.




CHAPTER  XIV 
COMPETITION  AND  OTHER  PLANT  ERRORS 


I.  Plants   in  Relation  to  Error 

That soil heterogeneity will contribute to experimental error has already been seen.
There are also many errors due to plants that may contribute to the experimental
error. These may be caused by differences in genetic constitution or variations due
to environmental conditions. Variations in plant stand within plots may introduce
differential responses due to intra-plot competition, while the effect of one plot
on the adjacent one may bring about differences due to inter-plot competition. Other
errors related to plants include such effects as differences in the moisture content
of the harvested crop, differences in adaptation, etc.

II.  Acclimatization 

A serious systematic error may be introduced thru differences in acclimatization of
the crops under test, unless acclimatization itself is the factor under consideration.
Varieties in crops like corn, alfalfa, and red clover may vary widely in their climatic
adaptation. Variety tests in corn may be a common source of error in this respect.
In Nebraska, Kiesselbach (1922) compared Reid Yellow Dent corn grown 100
miles farther north with that grown and adapted at Lincoln. He obtained large differences
in yield, plant height, date of maturity, length of ear, etc., within the
same variety when originally grown under different conditions. Lyon (1911) reported
similar results for corn and also for strains of Turkey wheat from other states included
in winter-hardiness tests. Differences in varieties may be brought about in
a very few years which may introduce either a slight or a very large error. Reliable
tests are impossible when varieties are collected from different climatic regions.
Each variety should be grown for a year or two in the region where it is to be tested
until it has undergone the changes incidental to adaptation to the new environment.

III.  Plant  Individuality 

Plant individuality varies with different crops. It is more marked in cross- than in
self-fertilized crops, e.g., it would be more important in rye than in barley. The
size of plot necessary is influenced by the number of plants grown per plot, as well
as by the kind of plant. For instance, it is easily possible to have 1,000,000 wheat
plants on one acre, while the number of corn plants is only about 10,000 per acre.
Plant individuality would be negligible in the case of small grains, but quite important
in crops like corn and sorghums where the number of plants per plot may be quite
low. Lyon (1911) found that quite a large error may be introduced by yield determinations
from a small number of plants due to the variations in growth of certain
individuals. For maize, he showed that the effect of plant individuality was practically
none when each plot was composed of 100 plants.

IV. Variation in Moisture Content of Harvested Crop

In forage and cereal crops the variation in moisture content of the harvested crop
may be an important source of error in yield determinations. For precise experimental
results, this condition should be recognized and a remedy provided for it.

(a)  Moisture  in  Forage  Crops 

Obviously, the most accurate method to determine the water or dry-matter content
of the forage grown on a plot is to dry all the material to a water-free basis.
Since it is impossible to do this, dry matter determinations are based on small
shrinkage samples.

The problem of moisture determination is rather simple under semi-arid conditions
where forage is readily field-cured. Forage weights are usually taken after
the material is dry enough to stack, and a shrinkage sample is taken at that time. The
sample is weighed immediately and allowed to air-dry for 2 or 3 weeks, after which it
is re-weighed for the air-dry weight. Yields corrected on this basis are found to
be reliable. In case moisture-free determinations are necessary, the samples of each
variety or treatment may be composited, ground, and dried in a vacuum oven for 12 to
24 hours.
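
The shrinkage-sample correction described above amounts to a simple proportion. The sketch below is illustrative only: the function name and the weights are hypothetical, and it assumes the sample is representative of the forage on the whole plot.

```python
def air_dry_yield(plot_field_weight, sample_field_weight, sample_air_dry_weight):
    """Convert a plot's field weight to an air-dry basis using a shrinkage sample.

    The sample is weighed when the plot is weighed and again after air-drying;
    the ratio of the two sample weights is applied to the whole plot.
    """
    if sample_field_weight <= 0:
        raise ValueError("sample_field_weight must be positive")
    if not 0 <= sample_air_dry_weight <= sample_field_weight:
        raise ValueError("air-dry sample weight must lie between 0 and the field weight")
    dry_fraction = sample_air_dry_weight / sample_field_weight
    return plot_field_weight * dry_fraction


if __name__ == "__main__":
    # Hypothetical figures: a 120.0 lb plot and a 2.0 lb sample that dries to 1.7 lb.
    print(round(air_dry_yield(120.0, 2.0, 1.7), 1))
```

The same proportion applies when the sample is dried to a moisture-free basis; only the second sample weight changes.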

Under humid conditions, reliable comparisons from the weights of field-cured
forage cannot be made, except on rare occasions that cannot be predicted. As a result
of the work of Farrell (1914), McKee (1914), Vinall and McKee (1916), and Arny
(1916), the general practice has been to weigh the forage as soon as cut, and to sample
it for air-dry or water-free determinations at that time. These green samples are
usually placed in a drier at once to avoid the loss of dry matter thru oxidation,
fermentation, etc. The investigations of Wilkins and Hyland (1938) indicate that the
samples should be taken and weighed within 4 to 6 minutes after the forage is cut to
avoid error due to moisture loss. These workers also found that the error introduced
thru the use of green weights of alfalfa and red clover for plot yields without dry
matter determinations was negligible so long as the weights were taken quickly.
Yield determinations on the basis of green weights proved to be as accurate as where
2 or 3 samples per plot served as a basis for moisture determination and subsequent
yield correction.

(b)  Moisture  in  Cereals 

The moisture content of small grains is usually of little consequence since
the bundles are usually air-dry before threshing is attempted. The threshed grain
may be weighed at the threshing machine and re-weighed a week later to be sure it
has reached a uniform moisture content. The determination of moisture in shelled
corn is regarded as an essential practice for obtaining precise yields. Moisture
determinations can be made on each plot of each variety, or a composite determination
can be made for the variety on all replications. A common practice is to report yields
on the basis of shelled corn with 15.5 per cent moisture, the maximum moisture permitted
for U.S. No. 2 corn. Moisture determinations for corn or small grains can be made quickly
with the Brown-Duvel moisture tester described by Coleman and Boerner (1927).
Recently, the Tag-Heppenstall moisture meter, an electrical device, has been widely
used for moisture determinations. This meter is calibrated for wheat, corn, oats,
barley, rye, sorghums, rice, soybeans, and vetch. The electrical moisture meter
has certain advantages for practical work: (1) it is unnecessary to clean after
each sample; (2) the sample is not weighed; (3) a single determination can be made
in less than one minute; (4) it will duplicate results within a tolerance that cannot
be met in a single determination by other methods; and (5) the operation and
maintenance cost is low. Cook, et al. (1934) have made a study of rapid moisture
determination devices. When determinations are made on each variety in each replicate,
single rather than duplicate determinations should be sufficient.
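
Reporting shelled corn on a common moisture basis can be sketched as follows. This is a minimal illustration, assuming that moisture percentages are expressed on a wet-weight basis so that the dry matter is held constant; the function name and the sample figures are hypothetical.

```python
STANDARD_MOISTURE_PCT = 15.5  # basis named in the text for shelled corn


def adjust_to_standard_moisture(weight, moisture_pct, standard_pct=STANDARD_MOISTURE_PCT):
    """Weight the grain would have at standard_pct moisture.

    Moisture percentages are assumed to be on a wet-weight basis, so the
    dry matter, weight * (100 - moisture_pct) / 100, is held constant.
    """
    if not 0 <= moisture_pct < 100 or not 0 <= standard_pct < 100:
        raise ValueError("moisture percentages must be in the range [0, 100)")
    return weight * (100.0 - moisture_pct) / (100.0 - standard_pct)


if __name__ == "__main__":
    # Hypothetical plot: 50.0 lb of shelled corn testing 20.0 per cent moisture.
    print(round(adjust_to_standard_moisture(50.0, 20.0), 2))
```

Grain wetter than the standard is adjusted downward and grain drier than the standard upward, so that varieties harvested at different moisture contents are compared on equal terms.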

V. Competition Concept in Plants

A "struggle for existence" results when plants are grouped or occur in communities in
such a way that the demands for an essential factor are in excess of the supply.
This is true in many field trials. Competition always occurs when two or more plants
make demands for light, nutrients, or water in excess of the supply. It is greatest
between individuals of the same species which make similar demands upon the supply at
the same time. This is generally the case in cultivated crops where an area is
planted to the same species or variety. A detailed discussion on the nature of plant
competition is given by Clements and others (1929).

A number of investigations have been conducted to determine the importance of plant
competition in experimental plots. There is apt to be an effect when varieties that
differ considerably in growth habit, time of maturity, and other characters are
grown in adjacent plots. The principal contention is whether or not the yield of a
poorer variety growing next to a high-yielding variety will be adversely affected so
that the yield will actually be lower than when the variety is grown next to a plot
of its own kind.

Competition may or may not influence plot yields. Two distinct schools of opinion
have arisen as to its importance. In areas of limited moisture supply, competition
has been generally found to be a source of error in comparative crop tests. Kiesselbach
(1918) obtained errors of 24 and 46 per cent due to plant competition in two
different years. Hayes and Arny (1917) found errors in small grain yield trials
where varieties competed with each other. In Missouri, Stadler (1921) reported errors
of 50 to 100 per cent due to plant competition. Some workers in the more humid regions,
where moisture is often sufficient throughout the season for ordinary stands,
consider competition effects unimportant. For example, Stringfield (1927) found only
occasional disturbances in Ohio, while Garber and Odland (1925) failed to find evidences
of competition in adjacent soybean rows on the West Virginia Station. Love
and Craig (1938) concluded that the effect of competition is not serious enough to
influence the yields of wheat and oats under New York conditions.

The influence of plant competition depends upon the test being conducted, but the
possible error from this source should be kept in mind constantly. It is a safer
procedure to eliminate or provide for this source of error than to be led to erroneous
conclusions by overlooking it.

A -- Intra-plot Competition

VI.  Uneven  Plant  Distribution  in  Plots 

Plants within a plot are in competition with each other when some factor such as
moisture is present in insufficient quantities. Uneven plant distribution, with a
normal number of plants per plot, was studied in corn by Kiesselbach and Weihing
(1933) to determine whether or not this condition would alter acre yields. Corn was
planted in hills 3.5 feet apart so as to average three plants per hill. The three
systematically uneven distributions were planted so as to have 2-4, 1-3-5, and
1-2-3-4-5 plants in alternate hills. Essentially uniform stands of three plants per
hill were grown for comparison. During a 14-year period the systematically variable
stands of 2-4, 1-3-5, and 1-2-3-4-5 plants per hill averaged 50.6, 49.3, and 50.0
bushels per acre, respectively. The three variable stands averaged 50.0 bushels
while the uniform 3-plant stand averaged 49.9 bushels per acre. In another trial,
these investigators tested a random variable stand by planting corn that germinated
100, 75, 60, and 50 per cent at adjusted rates to average three viable kernels per
hill. The plot yields for a single season were 24.96, 25.50, 25.34, and 25.12
bushels per acre for the respective germination percentages. From these data, it was
concluded that systematically and randomly variable stands did not affect the yields
so long as the same number of plants occurred on a plot. The authors caution that
"experience has indicated that stand irregularities materially greater than those
herein considered, such as are sometimes caused by rodents, worms, birds, and soil
washing, would undoubtedly increase plant variability and lower the yield."

A similar type of study was conducted by Smith (1937) on 2 Australian wheat fields
planted by a farm drill in which short lengths of drill row were harvested separately.
Variability of plant density as found in a drill-sown field did not by itself
cause a decrease in yield of grain as compared to even spacing of seed. The correlation
of yield and plant number per foot of drill row, which is invariably observed
in small grain fields, was said to be due to the effects of competition between nearby
densities. He makes this statement: "The true correlation between yield and
plant density per area may be positive, negative, or zero according to circumstances.
The yield from variable seeding may be less than, equal to, or even slightly greater
than the yield from even seeding, according to how near the even seeding may be
optimum for the given conditions and how far the variable seeding may fall within or
overlap the optimum range within which plant density is of little importance."


VII. Differences in Stand in Plots

The ideal condition for yield trials is a perfect stand on all plots, but this is
not always attained. Some allowance must be made for the lost area where the stand
is injured by outside influences, particularly with some crops. When the loss in
stand is due to the treatment, i.e., it injures germination, destroys part of the
plants, or in any manner is directly responsible for stand, the use of a perfect-stand
basis for yield calculations eliminates the effect of the treatment. The
lethal effect of the treatment may be a definite part of the results obtainable and
should be given consideration. The effect of stand within plots is more of a problem
in crops like corn, sorghums, certain legumes, and sugar beets where the plants
are large, variable inter se, and subject to the influences of plant competition.
Less difficulty is experienced in small grains because the plants tend to tiller and
fully utilize the extra space.


(a) Competition between Unlike Hills in Corn

The relative yields of one-, two-, and three-plant corn hills uniformly surrounded
by three-plant hills were studied by Kiesselbach (1918) (1923). From the yields he
found, it was obvious that the fewer plants per hill made some use of the additional
space. In another test, the relative yields of three-plant hills were compared when
adjacent to hills with various numbers of plants.


3-plant hills surrounded by 3-plant         Average Grain Yield per Hill
hills except as indicated below:            Actual (lbs.)    Relative (pct.)

Surrounded by 3-plant hills
Adjacent to one hill with 2 plants
Adjacent to one hill with 1 plant
Adjacent to one blank hill

It is obvious that 3-plant hills adjacent to blank or 1- or 2-plant hills tend to
yield higher than when surrounded by 3-plant hills. In comparisons of inbred lines
and F1 hybrids, Brewbaker and Immer (1931) found that a rather large error may be
introduced in yields where hills have reduced stands or are adjacent to hills which
lack in stand. However, under a 3-plant rate in corn it is generally conceded that
10 to 15 per cent of the stand may be lost before the yield is measurably reduced or
the experimental accuracy affected in ordinary yield trials.

(b) Competitive vs. Non-Competitive Yields in Sugar Beets

Sugar beet tonnages are usually reported as (1) total weight of all beets on
a unit area, or as (2) a calculated yield from "normally competitive" beets. The
beets which serve as the basis for calculation are those grown surrounded by neighbors
on all sides at appropriate distances for the conditions imposed in the experiment.
A study of the response of sugar beets to increased space allotment was made
by Brewbaker and Deming (1935). Their plants were grown in 20-inch rows and thinned
to 12 inches between plants in the row. Beets around a single blank space were found
to  increase  in  weight  sufficient  to  compensate  for  pfc>.2  per  cent  of  the  loss  of  a 
single beet. They obtained increases of 28.7, 39.2, and 95.0 per cent for beets
adjacent to one blank space in the same row, between two blank spaces in the same
row, and with blanks on four sides, respectively. It is evident that the beet weight
was greatly influenced by the relative area available for its development. The regression
of weight of beets upon stand was essentially linear for stands between 25
and 75 per cent. For each 10 per cent increase in stand there was an increase of from
0.76 to 2.10 tons of beets per acre for the regression within blocks. There may be
situations where yields based on competitive beets would be in error, particularly in
poor stands and in spacing tests. Such instances have been pointed out by Nuckols
(1936). He harvested actual and competitive beets on 254 plots where the stand
varied from 50 to 100 per cent. A greater difference in competitive and actual
yields was obtained for poor stands than for good stands. In fact, the mathematical
possibilities showed that there are only 35 per cent of competitive beets in a 90
per  cent  stand,  "'0  per  cent  in  an  80  per  cent  stand,  and  5  per  cent  in  a  70  per  cent 
stand.  This  indicates  the  greater  possibility  for  error  when  competitive  beets  are 
taken  from  poor  stands.  Nuckols  also  found  an  indication  that  there  is  a  greater 
difference  between  competitive  and  actual  yields  where  the  beets  are  closely  spaced 
than  where  more  widely  spaced  in  the  row.  It  is  obvious  that  in  rate  of  spacing 
tests,  the  method  of  selection  of  competitive  beets  is  not  the  same  for  all  plants. 

(c)  Stand  Effects  in  Other  Crops 

In potatoes, Livermore (1927) reports that the yield of the two hills adjacent
to a blank may be 40 per cent more than that for hills surrounded by hills.
Werner and Kiesselbach (1929) found 62.5 per cent of the loss in yield was recovered
in potatoes adjacent to one-hill blanks. In alfalfa nursery plots, Tysdal and Kiesselbach
(1939) found for variable seed rates that stands tended to equalize after
4 years. Considerable latitude in the amount of seed sown per row was possible without
serious effects on comparative varietal performance.

VIII.  Corrections  for  Uneven  Stands 

A great deal of attention has been given to possible corrections for loss of stand.
It should be emphasized that there is no entirely satisfactory method to correct for
uneven stands, it being better practice to prevent them so far as possible. One
method to avoid poor stands is to plant thick and thin the young plants to the desired
rate. For example, where a stand of 3 plants per hill is desired in corn, the
experimenter may plant 6 kernels per hill and subsequently thin the seedling plants
to 3 per hill. Most empirical methods for the correction of yields on a stand basis
are based upon plants surrounded by the normal stand, i.e., competitive plants.
Stewart (1919) (1921) gives a formula for the correction of stand errors in potatoes
where the stand is relatively satisfactory. The practice in corn experiments is to
harvest the entire plot without stand corrections when the stand is 90 per cent of
the theoretical or better. For less than that, it is usually harvested on a perfect-stand
basis. Kiesselbach (1918) (1923) selects only perfect-stand hills surrounded
by hills with the same stand and computes the yields from these. Bryan (1933) found
that 26 per cent fewer hills were required to obtain any given degree of precision
with only perfect-stand hills than with all hills regardless of stand. Adjustment
of the yields of perfect-stand hills further reduced the number of hills required for
any degree of precision by 18.9 per cent. The procedure in the U. S. Department of
Agriculture for the uniform corn hybrid tests is to adjust yields for missing hills
but not for minor variations in stand.

Probably  the  most  satisfactory  method  for  the  adjustment  of  yields  on  the  basis  of 
stand  is  by  covariance  in  which  the  regression  coefficients  are  calculated.  Mahoney 
and  Baten  (1939)  have  outlined  its  use  for  this  purpose.  When  there  is  a  fairly  high 
variation  due  to  soil  heterogeneity  and  no  appreciable  differences  in  stand,  usually 
nothing is gained by adjustment.
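
The idea behind a covariance adjustment for stand can be sketched in a few lines. This is a simplified illustration and not the procedure outlined by the authors cited above: it assumes a one-way classification (treatments only, with no block effects removed), uses the pooled within-treatment regression of yield on stand, and adjusts each treatment mean to the general mean stand. The function name and the data are hypothetical.

```python
from collections import defaultdict


def adjust_means_for_stand(plots):
    """Adjust treatment mean yields to a common stand by covariance.

    plots: list of (treatment, stand, yield) tuples, one per plot.
    Returns (b, adjusted), where b is the pooled within-treatment regression
    of yield on stand and adjusted maps treatment -> adjusted mean yield.
    """
    groups = defaultdict(list)
    for treatment, stand, yld in plots:
        groups[treatment].append((stand, yld))

    grand_stand = sum(stand for _, stand, _ in plots) / len(plots)

    sxx = sxy = 0.0
    means = {}
    for treatment, rows in groups.items():
        mean_x = sum(x for x, _ in rows) / len(rows)
        mean_y = sum(y for _, y in rows) / len(rows)
        means[treatment] = (mean_x, mean_y)
        # Sums of squares and products of deviations within this treatment.
        sxx += sum((x - mean_x) ** 2 for x, _ in rows)
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in rows)

    if sxx == 0:
        raise ValueError("stand does not vary within treatments; no adjustment possible")
    b = sxy / sxx
    adjusted = {t: my - b * (mx - grand_stand) for t, (mx, my) in means.items()}
    return b, adjusted


if __name__ == "__main__":
    # Hypothetical plots: (variety, per cent stand, yield in bushels per acre).
    data = [("A", 90, 45.0), ("A", 100, 50.0), ("B", 80, 42.0), ("B", 90, 47.0)]
    slope, adjusted_means = adjust_means_for_stand(data)
    print(slope, adjusted_means)
```

In a replicated field trial the regression coefficient would ordinarily be taken from the error line of the analysis of covariance, after block and treatment effects are removed, rather than from this simplified within-treatment calculation.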

B -- Inter-plot or Border Effect Competition

IX. Types of Inter-plot Competition

Many studies have been conducted to determine the border effects of adjacent plots.
The committee on the standards for the conduct of field experiments for the American
Society of Agronomy (1933) makes this statement: "In a majority of soil experiments
and in many cultural and variety tests, plot yields may be modified by contiguity to
other treatments, crops, or interspaces. Border competition in adjacent unlike plots
often raises some yields and lowers others." A vigorous variety may benefit when
grown next to a poor one, particularly in single-row plots. The same type of error
may be introduced in rate and date of planting tests. As a result, multiple-row plots
are often used in experimental work with the border rows discarded. This procedure
is justified on the basis of experimental data which, according to Arny (1921),
indicate that the yield order may be changed when border rows are included in the
plot yields. In some fertilizer and cultural experiments alleys between plots are
necessary because the treatment may spread to the next plot through faulty application.

X.  Effect  in  Variety  Tests 

Most  tests  to  determine  the  amount  of  inter-plot  competition  have  been  on  the  basis 
of  single-row  vs.  multiple  row  plots  with  the  borders  discarded. 

(a)  Small  Grains 

It is concluded by Hayes and Arny (1917) that there is considerable competition
between rod rows of small grains when grown one foot apart. This led to the
adoption of three-row plots for small grain variety tests at the Minnesota station.
Comparisons of three-row plot yields with those of the central rows showed that the
central rows alone are as accurate for yield determinations as all three rows.
Kiesselbach (1918) found that competition caused Big Frame wheat to yield 10.3 and
12.4 per cent too high in 1913 and 1914, respectively, when grown in alternate rows
with Turkey. Burt oats yielded 10 and 38 per cent too high for these years when
grown in alternate rows with Kherson. Stadler (1921) found competition in small
grains to be more extreme between different varieties than between different commercial
strains of the same variety. As a result, it is almost the universal practice
to grow small grain nursery plots in multiple rows and to discard at least one border
row from each side at harvest. The use of single-row or 3-row
plots with all rows harvested appears possible under humid conditions where competition
appears to be slight. (See Love and Craig, 1938.)

(b) Corn

As early as 1909, Smith (1909) found one-row plots too narrow for fair tests
in corn when varieties of diverse characteristics were planted in adjacent rows.
A variety with short stalks was at a disadvantage when grown next to a taller one
because of shading; or a variety with "strong foraging powers" may compete more successfully
for moisture and plant food than a weaker or slower growing neighbor.
Kiesselbach (1922) (1923) found that where large and small varieties of corn were
grown in alternate rows, the smaller variety yielded 66 per cent as much as the
larger one, and only 47 per cent as much when both were planted in the same hill.
The smaller variety yielded 85 per cent as much when planted in alternate 5-row plots
and the three center rows harvested for yield. That the smaller variety was being
robbed of light, water, and nutrients was shown by the yields where each variety was
surrounded by its own kind.

(c) Other Crops

Competition between soybean varieties was studied by Brown (1922) in Connecticut.
Twenty-five single-row check plots of a small and early soybean variety averaged
26.9 bushels of seed per acre. When the check was adjacent to larger and later
varieties like Mammoth Yellow, the checks averaged only 17.1 bushels, or 63.6 per
cent as much as the average of all checks. In potatoes, he concluded that yields
were not influenced by competition between single-row plots.

In alfalfa, solid-drilled plots with a 7-inch row spacing have been shown to
be definitely subject to serious interplot varietal competition. The work of Tysdal
and Kiesselbach (1939) indicates that the effects could be overcome when the border
rows were discarded at harvest. When the alley space between plots was widened to
12 inches a significant interaction between varieties was also prevented. The relative
yields from single or multiple-row plots with either 18 or 24-inch row spacing
likewise exhibited no significant differential interaction.

Immer (1934) made a study of the effect of competition between adjacent rows
of different varieties of sugar beets, i.e., "Old Type" and "Extreme Pioneer".
These were grown in alternate single-row plots and also in 4-row plots with the border
rows removed for yield determinations. When grown in single-row plots the "Old
Type" brand yielded 3.78 ± 0.44 tons more per acre than "Extreme Pioneer." In 4-row
plots, with the central two rows alone being harvested, the increase of "Old Type"
over "Extreme Pioneer" was only 1.78 ± 0.31 tons per acre. The difference between
these two differences was 2.00 ± 0.54 tons, a value that is significant. Thus, "Old
Type," the higher yielding sort, profited at the expense of "Extreme Pioneer" when
these two brands were grown side by side in single-row plots.
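
The error attached to the difference between the two differences can be checked by combining the two separate errors as the square root of the sum of their squares. The sketch below assumes the two comparisons are independent and uses the figures as read from the text; the function name is only illustrative, and the text does not say whether the ± values are probable errors or standard errors.

```python
import math

def difference_with_error(value_a, error_a, value_b, error_b):
    """Return (a - b, error of a - b) for two independent estimates."""
    difference = value_a - value_b
    # Errors of independent estimates combine as the root of the sum of squares.
    error = math.sqrt(error_a ** 2 + error_b ** 2)
    return difference, error

# Advantage of "Old Type" over "Extreme Pioneer", tons per acre.
single_row = (3.78, 0.44)   # alternate single-row plots
four_row = (1.78, 0.31)     # 4-row plots, central two rows harvested

diff, err = difference_with_error(*single_row, *four_row)
ratio = diff / err          # difference expressed in units of its own error
print(f"difference = {diff:.2f} +/- {err:.2f} tons; ratio = {ratio:.1f}")
```

The difference is between three and four times its own error, which agrees with the statement that the value is significant.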

In cotton variety tests, Christidis (1937) found that competition may cause
a definite bias in the estimation of comparative yields of cotton varieties. Hancock
(1936) tested two cotton varieties with diverse growth characteristics. The
varieties were: Acala, a late tall variety, and Delfos, an early semi-dwarf type.
The varieties were arranged in these combinations with the series alternated: DDDDAD
and AAAADA. He observed that Delfos rows with Acala on only one side (DDA) showed very
small differences when compared with Delfos rows between their own border rows (DDD).
For instance, DDD as an average for four years produced only 1.4 per cent more seed
than DDA, while AAA produced 4.01 per cent less than AAD. Where two rows of the
same variety are planted, only one row would be affected by a different variety.
Since he found this effect to be small, two-row plots were advocated with both harvested
for yield. Such a procedure may be satisfactory under conditions of abundant
moisture, but would be questionable where habitat factors are severely limited.

XI.  Rate  and  Date  Tests 

Under most environmental conditions competition will exist between plots in rate
and date of planting tests. Hulbert (1931) presents data to show that border effect
on outside rows increases as the rate of seeding is increased. The border effect on
Red Bobs wheat was 147.85 per cent when seeded at the rate of three pecks per acre,
175.41 per cent for five pecks, and 173.01 for seven pecks. Kiesselbach (1918)
tested two rates of planting for Turkey wheat, a thin and a thick rate. The thin
rate yielded 68 per cent as much as the thick rate when grown in alternate single-row
plots, and 90 per cent as much when grown in alternate five-row plots. Competition
between alternate single-row plots for two rates for Kherson oats caused the
thin rate to yield 20 per cent too low in 1913 and 34.3 per cent too low in 1914.
Nebraska White Prize corn was planted in alternate rows so as to obtain two and four
plants per hill. Due to competition the thin rate yielded relatively 29.0 and 9.0
per cent too low in different years. Similar results would be expected in date of
planting tests. Klages (1928) found a marked degree of competition in spacing tests
with sorghums. Yields of rows with dense stands profited at the expense of the yields
of adjacent rows with thinner stands. The degree of competition was influenced by
environmental conditions.

XII. Border Effect

Plants that grow along the sides and ends of plots are often more thrifty and vigorous
than those in the interior. This is particularly true when the plots are surrounded
by alleys. Border effect is considered here to mean the effect of blank
alleys on the border rows. The amount and extent of this border effect are important
in comparative crop tests.

(a)  Small  Grains 

Arny and Hayes (1918) and Arny (1921) (1922) studied (1) the distance alley
effect is operative within plots, (2) the increase in yield due to alley effect, and
(3) the influence of additional alley space on variety response. They used small
grain plots composed of 16 drill rows six inches apart. The yields of the border rows
were compared with those of the center rows. Arny (1921) gives some typical data:


                            Oats             Wheat            Barley
Description              Bu.     Pct.     Bu.     Pct.     Bu.     Pct.

Outside border rows     65.58   199.9    30.56   153.6    48.93   213.5
Middle border rows      58.53   178.4    25.75   127.1    42.74   136.5
Inside border rows      49.95   152.3    22.23   111.7    35.56   148.4
Central rows            32.80   100.0    19.90   100.0    22.92   100.0
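
The percentage columns in data of this kind are simply each row yield divided by the central-row yield. A minimal sketch for the oat figures follows; the central-row value of 32.80 bushels is an assumption, since that entry is difficult to read in this copy, and the function name is illustrative only.

```python
def percent_of_central(row_yield, central_yield):
    """Express a row yield as a percentage of the central-row yield."""
    return 100.0 * row_yield / central_yield

# Oat yields in bushels per acre for the three classes of border rows.
oat_yields = {"outside": 65.58, "middle": 58.53, "inside": 49.95}
central = 32.80  # assumed reading of the central-row yield

oat_percent = {
    row: round(percent_of_central(bushels, central), 1)
    for row, bushels in oat_yields.items()
}
print(oat_percent)
```

With these readings the outside rows come to roughly twice the central-row yield, and the excess falls off toward the interior of the plot.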

As an average for three years the yield of the outside rows of oats, spring wheat, and
barley, expressed in per cent of the yield of the central rows, was 199.8 and
that for the middle rows 138.0 when the plots were surrounded by 18-inch clean-cultivated
alleys. Border effect was relatively unimportant when extended to the third
drill row. Knowledge that border effect is not uniform precludes the use of any
percentage figures derived in one place to reduce yields secured in another location
to a border-effect-free basis. Arny (1921) further showed that the rank of a variety
may be changed due to border effect. In all cases, plot yields were higher where the
border rows were included than where these rows were eliminated before harvest.
Hulbert and Remsburg (1927) found
it necessary to discard two border rows from each side of small grain plots to remove
the error in border effect in variety tests. Competition effects were noticeably
increased when the adjacent plots were seeded at different rates. Hulbert, et al.
(1931) obtained similar results. Robertson and Koonce (1934) studied border effect
on Marquis wheat grown in plots irrigated at different stages in its relationship to
yield when different numbers of border rows were included. The yield increased as
the size of plot increased but the percentage increase was uniform for the three
different treatments employed. Comparable yields were the same for plots of 10 rows,
and for 10 plus 2, 4, or 6 border rows.

(b)  Other  Crops 

In  kafir  and  milo,  Cole  and  Hallsted  (1926)  obtained  marked  increases  in 
yield  from  outside  rows.  The  excess  yield  was  roughly  proportional  to  the  increased 
available  soil  area.  Recently,  Conrad  (1930  has  called  attention  to  the  fact  that 
sorghum plants next to uncropped areas may use soil moisture six feet away laterally.
A  definite  use  of  nitrates  was  made  four  feet  away  laterally  for  both  sorgo  and  corn. 
The influence of border effect on total dry matter per plot was studied at the Central
Experimental Farm (Ottawa) by McRostie and Hamilton (1927). In all cases,
border  plants  of  Western  rye  grass  gave  an  increased  yield  due  to  the  influence  of 
the  two-foot  pathway  which  surrounded  the  plots.  The  increase  in  yield  differed  with 
the  strain  under  test,  and  varied  from  6  to  ^k   per  cent.  The  rank  of  the  strains  was 
materially  changed  due  to  the  wide  variation  in  border  yields.  When  theoretical  plots 
1/72.6-acre in size were used for red clover and alfalfa forage yields, Hollowell and
Heusinkveld (1933) found a serious experimental error in yield when border rows were
included in the harvested plot. Their plots were laid out with 8, 12, and 18-inch
alleys. The inclusion of border rows increased the yield from 2.1 per cent to 20.0
per  cent  for  red  clover  and  from  1.8  to  lt.O  per  cent  for  alfalfa.  Border  effect  was 
greater on the first than on the second alfalfa crop, but varied greatly from year to
year under Ohio conditions. Rainfall appeared to be directly correlated with border
effect. These investigators concluded that the discard of two border rows would
effectively eliminate border competition on plots of this size. Similar results were
obtained by Tysdal and Kiesselbach (1939) when they compared dissimilar adjacent alfalfa
plots that differed as to spacing of rows or plants. A solid-drilled block
with 7-inch row spacing was separated by a 7-inch alley space from a space-planted
block with rows 24 inches apart. The adjacent border rows were compared with their
respective types of interior rows. The solid-drilled rows gave an excess yield of
74 per cent because of reduced competition on one side, whereas the space-planted
row was depressed 63 per cent in yield because of increased competition. It is evident
that great care must be exercised in taking yields from adjacent rows that are
affected with respect to row-space or density of stand.

XIII. Control of Inter-plot Competition

Inter-plot competition can be controlled by several methods. Hayes and Garber (1927),
Kiesselbach (1918) (1923) and others give these recommendations: (1) group varieties
with similar growth habits, dates of maturity, etc., together; (2) use multiple-row
plots; and (3) discard outside border rows and ends at time of harvest. Alleys
are sometimes used in closely-sown crops such as small grains and forage crops to
facilitate harvest and to reduce mixtures. In small plots the borders should be removed,
but in large field plots it is generally satisfactory to harvest the entire
plot and to include the additional alley space in the plot area. Untreated interspaces
of sufficient width to avoid serious soil translocation are recommended for
permanent soil fertility, rotation, and tillage experiments. These alleys can either
be cropped or left bare.

Questions for Discussion

1. Why should pure seed be used in variety tests?

2. How may differences in acclimatization introduce errors in crop tests? How can
they be avoided?

3.  When  or  with  what  crops  or  under  what  conditions  is  plant  individuality  a  factor 
to  be  considered  in  planning  experiments? 

4. When is moisture content of the crop a factor of importance? How may the error
be  eliminated  or  corrected? 

5.  Could  you  secure  comparable  forage  yields  by  taking  green  weights?  Why? 

6.  What  are  the  advantages  of  the  vacuum  oven  over  an  ordinary  oven  for  securing 
moisture-free  weights? 

7.  Compare  rapid  moisture  determining  devices  for  cereals. 

8.  What  is  meant  by  plant  competition?  Who  have  emphasized  its  importance? 

9. Is competition universally present in experimental plots? Is it always objectionable?
Explain.

10.  What  effect  does  severe  competition  have  on  plants? 

11.  How  can  you  reconcile  the  fact  that  some  workers  claim  plant  competition  is  a 
fruitful  source  of  error  in  experimental  work,  while  others  contend  it  is 
negligible? 

12.  Do  stand  irregularities  in  corn  affect  the  yield  so  long  as  the  same  number  of 
plants  per  unit  area  is  involved?  Explain. 

13.  Why  may  a  variable  stand  in  a  wheat  field  yield  as  much  as  an  evenly-spaced 
stand?  Explain. 

14. Why is intra-plot competition in small grains unimportant from the practical
standpoint? 

15. What is the general effect in corn hills surrounded by hills with different numbers
of plants? Why?

16.  What  is  meant  by  "normally  competitive"  in  calculation  of  sugar  beet  yields? 

17.  What  is  the  effect  of  adjacent  blank  hills  on  the  weights  of  individual  beets? 

18. Under what conditions may yields from "competitive" beets result in errors in
yield?

19. What recommendations would you make as to correcting for uneven stands?

20. What is the general practice for the prevention of errors due to uneven stands
in corn and sorghums?

21. How are stand errors generally corrected in corn plots?

22. What is meant by "border competition"?

23. How may errors be introduced in variety tests by use of single-row plots?

24. How could you possibly justify two-row plots in cotton variety tests with no
borders removed? Single-row alfalfa plots?

25. How does competition introduce errors in rate and date tests?

26. Under what conditions may it be desirable to have blank alleys surrounding plots?

27. What influences do border rows have on plot yields?

28. Is it always necessary to remove borders for the determination of plot yields?
Why?

29. What recommendations would you make for the control of inter-plot competition?


Problems 

1. Explain how to arrange and conduct an experiment with 10 varieties of corn so as
to control both intra- and inter-plot competition.

2. The yield of field-cured hay on a 1/10-acre plot is 430 lbs. The shrinkage sample
taken at that time weighed 3.8 lbs. After 3 weeks it weighed 3.4 lbs. Calculate
the yield per acre of the plot on an air-dry basis.

3. The yields (marketable ears) and stands of 6 strains of sweet corn for 4 replications
were as follows: Data from Mahoney and Baten.


                   Yield and Stand for Strain Number:
Item                1       2       3       4       5       6

Replication 1.
  Yield (x)        56      31      21      23      30      60
  Stand (y)        77      68      61      83      70      3)4

Replication 2.
  Yield (x)        64      29      32      20      59      30
  Stand (y)        30      76      72      38      39      92

Replication 3.
  Yield (x)        36      30      24      18      60      47
  Stand (y)        7*      83      82      78      78      78

Replication 4.
  Yield (x)        36      32      24      19      39      30
  Stand (y)        57      61      73      78      32      38

Yield Totals      212     122     101      80     208     247
Stand Totals     d. ju    268     283     327     289     342

Calculate  the  regression  of  yield  on  stand. 
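
The arithmetic of a regression coefficient may be sketched as follows. Note that the problem labels yield as x and stand as y, so the regression of yield on stand takes stand as the independent variable. The figures below are hypothetical and are not the data of Problem 3; the function name is illustrative, and the sketch is a simple regression over all plots that ignores the strain and replication classification.

```python
def regression_on(dependent, independent):
    """Least-squares slope and intercept for `dependent` regressed on `independent`."""
    n = len(dependent)
    if n != len(independent) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_dep = sum(dependent) / n
    mean_ind = sum(independent) / n
    # Sum of products of deviations, and sum of squares of the independent variable.
    sum_products = sum(
        (d - mean_dep) * (i - mean_ind) for d, i in zip(dependent, independent)
    )
    sum_squares = sum((i - mean_ind) ** 2 for i in independent)
    slope = sum_products / sum_squares
    intercept = mean_dep - slope * mean_ind
    return slope, intercept

# Hypothetical plot records (not taken from the problem).
stand = [60, 70, 80, 90]     # plants per plot (y in the problem's notation)
yields = [20, 25, 30, 35]    # marketable ears (x in the problem's notation)

slope, intercept = regression_on(yields, stand)
print(slope, intercept)
```

Here the slope is the change in yield associated with one additional plant per plot.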


CHAPTER  XV 
DESIGN  OF  SIMPLE  FIELD  EXPERIMENTS 

I.  Criticisms  of  Agronomic  Experiments 

There are about 2300 agronomic projects in force in the different state experiment
stations, besides those carried on by the U.S. Department of Agriculture, and those
in related fields. In fact, two-thirds of all agricultural experimental projects in
this country are agronomic. They have increased in number by 50 per cent since 1920.
Frequently, this experimental work is criticised by farmers and others. The criticism
may or may not be justified. Agriculture is sometimes looked upon as a "practical"
field in which results are sought rather than knowledge concerning the phenomena of
life. At other times, there is a genuine shortcoming in experimentation. Allen
(1930) states that fully one-half of the agronomic experimental projects consist of
tests and trials of different kinds. Very little ingenuity is involved in many of
them. Variety and cultural experiments are popular while many genetic studies are
merely field selection. Soil fertility experiments are often shallow. In many cases,
old methods of experimentation are used while in others the experiments are carried
too long.

A -- Basic Principles in Design

II. Outline of Experimental Tests

A review of literature on the subject should be the first step in the plans for an
experiment. This should be followed by a detailed outline in order to crystallize
the ideas of the investigator on the subject. Recently, Fisher (1937) has shown that
design of an experiment is inseparable from the statistical analysis of the data.

Certain  objectives  must  be  kept  in  mind  in  all  agricultural  experiments.  These  may 
be  enumerated  as  follows:   (1)  The  tests  should  furnish  a  basis  for  recommendations 
to farmers; (2) They should furnish ocular proof of the beneficial results attained;
and  lastly,  (3)  They  should  supply  information  on  the  fundamental  causes  of  the 
phenomena  which  the  results  are  expected  to  demonstrate. 

Several factors need to be considered in the outline of an experiment. These are
well described by Allen (1930): (1) It should be definite and limited in scope.
(2) The problem should be submitted to competent persons for criticisms and suggestions.
(3) Previous work on the subject should be familiar so that the investigator
can start work where others left off. (4) Next, he should ascertain the data essential
to the problem and devise means to secure and analyze them. (5) Then it remains
to test their applicability or sufficiency to the problem. Sometimes it is found
that progress is dependent upon the advance in related sciences.

III.  Principle  of  the  Extremes 

Results should be secured over a wide range on either side of the optimum. Stapledon
(1931) has made this statement: "I believe in all field experiments of a research
nature we should go at each end far beyond what is deemed by practical men
to be the economic limit." The situation may be illustrated in a rate of seeding
test for wheat where the optimum rate is approximately 5 pecks per acre. The preliminary
test should include rates at regular intervals from the very lowest to a
maximum well beyond the point where the optimum is expected to fall, e.g.,

     1 peck          5 pecks          10 pecks
     minimum         optimum          maximum

The size of interval in tests is determined by the amount of land, facilities,
character of the problem, or available finances. An increase in the size of the interval
is justifiable as one goes from the optimum to either a minimum or to a maximum.
In the final test, it may be advisable to throw out the extremes and conduct a
precise test around the optimum rate.
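
The choice of rates for the two stages can be laid out mechanically. The sketch below takes the 1 to 10 peck range and the 5-peck optimum from the illustration above; the number of levels and the limits of the final test are assumptions made only for the example, and the function name is illustrative.

```python
def evenly_spaced_rates(minimum, maximum, count):
    """Return `count` seeding rates at regular intervals, both ends included."""
    if count < 2:
        raise ValueError("count must be at least 2")
    step = (maximum - minimum) / (count - 1)
    return [round(minimum + i * step, 2) for i in range(count)]

# Preliminary test: the full range, well beyond the expected optimum on both sides.
preliminary = evenly_spaced_rates(1, 10, 10)

# Final test: the extremes dropped, closer intervals around the 5-peck optimum.
final = evenly_spaced_rates(4, 6, 5)

print("preliminary (pecks per acre):", preliminary)
print("final (pecks per acre):      ", final)
```

A wider interval toward the extremes, as the text allows, could be had by simply omitting some of the outer levels from the preliminary list.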

IV.  Simple  vs.  Complex  Experiments 

Experiments may be classified into several kinds based on the number of factors
studied at the same time. The formal experiment is sometimes preceded by a preliminary
test.

(a)  Preliminary  Tests 

All  preliminary  experiments  are  necessarily  empirical  in  nature.  They  give 
the  investigator  an  opportunity  to  detect  faulty  technique,  inadequate  methods,  etc. 
The  final  experiment  can  be  planned  to  eliminate  many  of  the  shortcomings  observed  in 
the preliminary test. A survey is sometimes used for a preliminary test. A further
use of the preliminary experiment is to reduce the error in subsequent tests. (See
Wishart and Sanders, 1935).

(b)  Simple  Experiments 

One thing is studied at a time in the simple experiment. All factors are
kept constant or uniform, so far as possible, except the one under investigation.
This is the classical method of experimentation, i.e., the essential conditions are
varied only one at a time. R. A. Fisher has recently pointed out that this approach
is inadequate for many research problems because the laws of nature may be controlled
and influenced by several variables. In his book on "The Design of Experiments",
Fisher (1937) makes this statement: "We are usually ignorant which, out of innumerable
possible factors, may prove ultimately to be the most important, though we may
have strong presuppositions that some few of them are particularly worthy of study.
We have usually no knowledge that any one factor will exert its effects independently
of all others that can be varied, or that its effects are particularly simply related
to variations in these factors". The simple experiment is justified when the time,
material, or equipment are too limited to allow for attention on more than one narrow
aspect of the problem. As an illustration of this type, an experiment can be set up
to determine the best variety of sugar beets to grow. Another could be designed to
determine the best fertilizers to apply, while a third separate experiment could be
relegated to the best cultural practices. The simple experiment is the one most commonly
used by investigators. It is recommended to beginners because it is less involved.

(c) Combination and Complex Experiments

More than one variable is studied at a time in combination experiments.
Examples of some of the more simple experiments of this type are: (1) rate and date
of planting tests, (2) the relation between time of planting and date of maturity,
(3) depth and rate of planting, in relation to yield, (4) fertilizer tests, etc.
Recently, the Rothamsted workers have advocated the complex experiment in which two
or more treatments are studied in all possible combinations. Yates (1935) states that
complex experimentation is due primarily to R. A. Fisher who first suggested it in
1926. It is extensively practiced at Rothamsted and to a lesser extent elsewhere.
Fisher (1937) claims two advantages of the complex experiment (factorial arrangement)
over experiments that involve single factors, viz., greater efficiency and greater
comprehensiveness. A further advantage is that a wider inductive basis for conclusions
is available. As an example, a complex experiment could be set up to determine
the responses to several fertilizers and methods of land preparation.

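The phrase "all possible combinations" can be made concrete by listing the treatment combinations of a small factorial arrangement. The factor levels below are hypothetical, chosen only to illustrate the fertilizer and land preparation example; the text does not name particular levels.

```python
from itertools import product

# Hypothetical levels of the two factors.
fertilizers = ["none", "nitrogen", "phosphate"]
land_preparations = ["fall plowing", "spring plowing"]

# Every fertilizer is combined with every method of land preparation.
treatments = list(product(fertilizers, land_preparations))

for number, (fertilizer, preparation) in enumerate(treatments, start=1):
    print(number, fertilizer, "x", preparation)
```

Three fertilizers and two methods give six treatment combinations, each of which would be replicated in the field.
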
V.  Replication 

As previously pointed out, soil heterogeneity is the principal source of error in
the field experiment. It can be overcome theoretically by replication which tends
to diminish the experimental error as well as to provide for an estimate of the magnitude
of such errors. Fisher (1931) gives a diagram to show these relationships:


                       I. Replication
                      /              \
    II. Random Distribution       III. Local Control
                |                         |
       Validity of estimate           Diminution
       of error                       of error


(a)  Relation  to  Soil  Heterogeneity 

The standard error of the mean of one variety or treatment
is inversely proportional to the square root of the number of replications. Some workers have
argued that increased replication results in more heterogeneity due to the occupation
of a larger land area with the result that a point will be reached beyond which further
replication will give no further increase in accuracy. Fisher (1931) points out
that the experimental error is due only to the irregularities within blocks and that
this difficulty is not effective when different treatments are compared locally within
relatively small pieces of land. The number of blocks or replicates makes no
difference because the block effect may be removed by the experimental arrangement
(e.g. randomized blocks and Latin squares). Large blocks present a problem in themselves.
The situation of large blocks led Hayes (1923) to make the statement that,
when a large number of strains are being tested, it is necessary to use a large number
of replications to attain the same degree of accuracy as when a smaller number of
strains are being compared. Special designs are advisable for tests of a large number
of varieties or treatments.
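
The square-root relation can be seen by computing the standard error of a mean for several numbers of replications. The single-plot standard deviation used here is a hypothetical figure, and the function name is illustrative only.

```python
import math

def standard_error_of_mean(plot_sd, replications):
    """Standard error of a treatment mean: single-plot standard deviation / sqrt(n)."""
    if replications < 1:
        raise ValueError("replications must be at least 1")
    return plot_sd / math.sqrt(replications)

plot_sd = 10.0  # hypothetical standard deviation of a single plot, in per cent of the mean

for n in (1, 4, 9, 16):
    print(n, "replications:", round(standard_error_of_mean(plot_sd, n), 2))
```

Four replications are needed to halve the error of a single plot, and sixteen to halve it again, which is why each added replicate yields a smaller gain than the one before.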

(b)  Duration  of  Tests 

Replication  in  time  is  a  necessary  consideration  in  experimental  tests. 
Comparative  results  from  various  treatments  or  varieties  are  frequently  modified  or 
even  reversed  in  different  seasons  in  response  to  climatic  and  soil  variations  and 
to  the  prevalence  of  plant  diseases,  insects,  and  other  pests.  The  American  Society 
of  Agronomy  (1933)  recommends  the  continuation  of  a  field  experiment  over  a  number  of 
years  so  as  to  give  a  random  sample  of  such  seasonal  effects.  As  an  illustration, 
seasonal variability at the Hays (Kansas) substation is greater than that due to
soil. 


Crop        Variable Factor        Acre Yield (Bu.)

Wheat       Season
Wheat       Soil

The standard deviation due to soil and season together would be: $s = \sqrt{s_1^2 + s_2^2}$,
where $s_1$ and $s_2$ are the standard deviations due to season and to soil, respectively. The
reduction in seasonal variation would require a replication of the test over a greater
number of years. Ordinarily, a minimum of 3 years should be required in a field experiment
where a seasonal influence is important.
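
Independent sources of variation combine as the square root of the sum of their squares. A minimal sketch follows; the two standard deviations are hypothetical figures, not the Hays values, and the function name is illustrative.

```python
import math

def combined_sd(*components):
    """Combine independent standard deviations as the root of the sum of squares."""
    return math.sqrt(sum(sd ** 2 for sd in components))

# Hypothetical standard deviations in bushels per acre.
season_sd = 4.0
soil_sd = 3.0

total_sd = combined_sd(season_sd, soil_sd)
print(total_sd)
```

Because the larger component dominates the sum of squares, reducing the soil error alone does little when the seasonal error is the greater of the two.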

VI. Plot Arrangements

Each variety or treatment may be arranged either (1) in the same order in each replicate,
or (2) entirely at random in each replicate. The former is called a systematic
distribution while the latter is designated as a random arrangement. Until rather
recently, systematic distributions have been generally used in field experiments.
Random arrangements have been advocated by Fisher (1931) (1937) and the Rothamsted
workers who claim that randomization is necessary for a valid estimate of error. Regardless
of the arrangement used, the various plots of a variety or treatment should
be arranged so as to adequately sample the experimental area. This usually leads to
certain restrictions on the arrangement.

(a) Random Arrangements

To justify random arrangements, Fisher (1931) states that uniformity trials
have quite generally established the fact that soil fertility cannot be regarded as
distributed at random but to some extent systematically. As an average, nearby plots
are known to be more alike than those farther apart. Moreover, soil fertility distribution
is seldom or never so systematic that it could be represented by a single
mathematical formula. As to the estimate of error, Goulden (1931) explains that it
depends upon differences in plots treated alike. Such an estimate will be valid only
when pairs of plots treated alike are not nearer together or farther apart than pairs
of plots treated differently. The total variance is made up of differences between
plots in both directions. When the differences between plots treated differently are
reduced by any sort of systematic arrangement one must automatically increase the
differences between plots treated alike, and vice versa, e.g.

V (total variance) = A (plots treated alike) + B (plots treated differently)

An alteration in either A or B will result in a corresponding alteration of the other
in the opposite direction. Systematic arrangements which attempt to distribute the plots of any one
variety or treatment as widely as possible over the experimental area tend to reduce
B and increase A. Thus, the real differences between varieties or treatments are
reduced and the experimental error increased. An example of a random arrangement for
8 "varieties" in 4 replicates is as follows:

Replicate I:    5-7-2-4-8-6-3-1

Replicate II:   1-3-5-6-2-8-7-4

Replicate III:  1-8-2-3-5-7-6-4

Replicate IV:   7-4-2-1-8-3-5-6

In practice, a set of random numbers such as those compiled by Tippett (1927) is useful
to effect randomization of treatments or varieties. One may draw numbered chips
at random or shuffle cards to obtain a random arrangement.
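
The drawing of chips or shuffling of cards has a direct counterpart in a pseudo-random shuffle. The sketch below is one way to do it, not the procedure of any worker cited; the function name and the seed are arbitrary choices, the seed serving only to make the layout reproducible.

```python
import random

def random_arrangement(varieties, replicates, seed=None):
    """Return one independently shuffled order of the varieties for each replicate."""
    rng = random.Random(seed)
    layout = []
    for _ in range(replicates):
        order = list(varieties)
        rng.shuffle(order)   # a fresh random order within each replicate
        layout.append(order)
    return layout

layout = random_arrangement(range(1, 9), replicates=4, seed=1927)
for number, order in enumerate(layout, start=1):
    print(f"Replicate {number}: " + "-".join(str(v) for v in order))
```

Each replicate contains every variety exactly once, so the randomization is restricted to the order within replicates.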

(b)  Systematic  Arrangements 

A systematic arrangement is the repetition of the varieties in the same
order in each replicate. Correlation between adjacent varieties is likely under such
arrangements. However, systematic arrangements may be more practical in some experiments.
Certain advantages have been given by the advocates of systematic arrangements:
(1) Simplicity. It facilitates planting, harvesting, and note-taking operations.
(2) It provides adequate sampling of the soil, i.e., allows for "intelligent
placement" of the various varieties or treatments. (3) Varieties may be arranged in
the order of maturity so as to facilitate machine harvest of field plots. (4) It may
be desirable to alternate dissimilar varieties (bearded and beardless) so that mechanical
mixtures can be detected in subsequent years. Systematic arrangement may be
effective in such cases. Through the use of plots which provide for the elimination of
plant competition effects, systematic distribution loses one of its most serious
sources of systematic error.

The plot scatter on the experimental area is a matter of simple repetition
when the plots are all planted in a single series, viz.,

Replicate I        Replicate II       Replicate III

ABCDEFGH           ABCDEFGH           ABCDEFGH

As a rule, all plots cannot be placed exactly in one series, i.e., there are either
too  few  or  too  many.   It  is  advisable  to  commence  each  block  with  a  different  variety, 
especially  when  there  is  a  soil  gradient  in  the  same  direction  as  the  series.  This 
eliminates  the  possibility  that  one  variety  will  fall  on  the  best  soil  in  each  block. 
For  compact  blocks,  the  knight's  move  (one  down  and  two  over)  is  a  common  arrangement 
to  secure  an  adequate  scatter,  viz., 

Replicate  Varieties 


A  B  C  D  S  F 

G  H 

G  H  A  B  C  I 

I  F 

EFGEAB 

0  D 

I 

II 
III 
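
The knight's move amounts to shifting the same variety order a fixed number of places from one replicate to the next. A minimal Python sketch of that idea follows, assuming a shift of two places per replicate as in the diagram; the function name `systematic_layout` and its parameters are illustrative and do not come from the text.

```python
def systematic_layout(varieties, n_replicates, shift=2):
    """Return one row of variety labels per replicate.

    Every replicate repeats the same order, moved `shift` places to the
    right relative to the replicate above it (the "knight's move").
    """
    k = len(varieties)
    rows = []
    for r in range(n_replicates):
        offset = (r * shift) % k
        # Rotate right: the last `offset` labels wrap around to the front.
        rows.append(varieties[k - offset:] + varieties[:k - offset])
    return rows


layout = systematic_layout(list("ABCDEFGH"), n_replicates=3)
for name, row in zip(["I", "II", "III"], layout):
    print(f"{name:>3}  " + " ".join(row))
```

With `shift=0` the same function gives the simple repetition of a single series.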

(c)  Influence  of  Arrangement  on  Error 

Few data are available to show the relative accuracy of systematic and random
arrangements. The "Student"-Fisher controversy in 1936 indicates that the problem
has not been fully settled. In a comparison of diagonal with random arrangements,
Tedin (1931) found that the degree of variability within 6 by 5 blocks was not
influenced by either arrangement in the estimate of error. However, he advised random
arrangements for the highest degree of scientific accuracy. In studies from
uniformity trials with rice, Pan (1935) concluded that, with a systematic arrangement of
varieties, the deviations from mathematical expectation were too great to be explained
on the basis of random sampling. In a randomized arrangement, the number of
differences in yield between all possible comparisons of hypothetical varieties that fell
within a range of 0.5σ, 1.0σ, etc., was computed. Satisfactory agreement with
mathematical expectation was obtained in two experiments, and poor agreement in one
(P = less than 0.01). On the other hand, Odland and Garber (1928) obtained somewhat
lower standard deviations from systematic arrangements than from the theoretical
random arrangement. So far as small grains in nursery plots are concerned, Love and
Craig (1938) found the relative yields to be about the same for systematic and random
arrangements.

VII.  Error  Control 

The  differences  between  plots  of  a  single  treatment  in  a  replicated  experiment  are 
due  partly  to  experimental  error  and  partly  to  the  average  differences  between  repli- 
cates. The  variability  between  replicates  is  irrelevant  to  the  experimental  test 
when  each  variety  or  treatment  occurs  but  once  in  a  replicate.  Therefore,  the 
variance  due  to  replicates  or  blocks  is  generally  removed  from  the  error.  The  pre- 
cision of  the  experiment  becomes  greater  when  a  large  amount  of  the  total  variability 
can  be  removed  in  this  way. 


The shapes of plots and blocks are also concerned in error control. Long narrow plots
are preferable within the block so long as the blocks themselves approach a square in
shape. The basic experimental designs are the randomized block and Latin square
arrangements. (See Goulden, 1939).

VIII.  Randomized  Blocks 

The randomized block test is the simplest type of experiment where satisfactory error
control is obtained. This type of design is extremely flexible and can be used for
as many as 30 treatments. The principal restriction in this test is that the same
treatment should fall only once in each block, the treatments or varieties being
arranged at random. The number of replicates or blocks depends somewhat upon the
number of treatments included in a block and the degree of precision desired. It is
preferable for the test area to be square in shape, altho this is not absolutely
necessary.

(a) Field arrangement
A field arrangement for 10 varieties in 4 blocks could be as follows: 1/

   I      5    10     7     2     4     8     9     6     3     1
  II      9     1     8     2     3    10     5     7     6     4
 III      6     1     2     9     8     3    10     5     7     4
  IV      3     7     8     4     9     6     2     5    10     1

For more than 30 varieties, special designs should be used. (See later chapters.)
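
Where a table of random numbers is not at hand, the same kind of field plan can be drawn with a pseudo-random number generator. The sketch below is only an illustration of that substitute technique, not the procedure used for the plan above; the function name `randomized_blocks` and the `seed` argument are assumptions.

```python
import random


def randomized_blocks(n_varieties, n_blocks, seed=None):
    """Return {block number: random order of variety numbers 1..n_varieties}."""
    rng = random.Random(seed)
    varieties = list(range(1, n_varieties + 1))
    # A separate random order is drawn for every block, so each variety
    # falls exactly once in each block.
    return {
        block: rng.sample(varieties, k=n_varieties)
        for block in range(1, n_blocks + 1)
    }


plan = randomized_blocks(n_varieties=10, n_blocks=4, seed=1)
for block, order in plan.items():
    print(block, order)
```

Fixing `seed` makes the plan reproducible; leaving it as `None` draws a fresh arrangement each time.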

(b) Computation of Sums of Squares
The yield data can be arranged conveniently for computation as in the table
below. (Data from Goulden, 1939).

                                    Varieties
Blocks      1      2      3      4      5      6      7      8      9     10   Totals

  I       34.0   16.0   34.1   14.5   18.5   29.9   28.6   16.0   17.3   23.1    232.0
  II      14.0   11.0   20.5   13.5   15.6   28.2   27.6    8.3   12.1   29.5    180.3
  III     26.6    9.0   29.3    7.9   13.4   25.5   23.5    5.6    8.1   22.6    171.5
  IV      18.5   11.9   21.0   15.2    8.9   28.8   16.5    9.5   10.3   17.7    158.3

Totals    93.1   47.9  104.9   51.1   56.4  112.4   96.2   39.4   47.8   92.9    742.1
Means    23.28  11.98  26.22  12.78  14.10  28.10  24.05   9.85  11.95  23.22    18.55


First, it is necessary to compute the sums of squares for total, varieties, blocks
(or replicates), and error. The correction factor is (Sx)^2/N, or (742.1)^2/40 =
550,712.41/40 = 13,767.81.

Total = S(x^2) - (Sx)^2/N = 16,279.27 - 13,767.81 = 2511.46

Varieties = S(xv^2)/n - (Sx)^2/N = 62,114.01/4 - 13,767.81 = 1760.69

In this case, it is necessary to square the total for each variety and divide by the
number of values that make up each total to reduce the results to a single-plot basis.

Blocks = S(xb^2)/m - (Sx)^2/N = 140,803.25/10 - 13,767.81 = 312.51

Error = Total - (varieties + blocks)
      = 2511.46 - (1760.69 + 312.51) = 438.26

1/ A set of random numbers such as table 6 in the appendix is useful to randomize the
varieties. In fact, columns I, III, V, and VII were used.

The data are assembled in a convenient table as follows:

Variation                 Sums of       Mean
 due to        D.F.       Squares      Square         s       F-value

Blocks           3         312.51      104.17                  6.42**
Varieties        9        1760.69      195.63                 12.05**
Error           27         438.26       16.23       4.029

Total           39        2511.46

**Exceeds 1.0 per cent point, i.e., the value of "F" which has a probability of 0.01
of occurring due to chance.

F = larger variance / smaller variance = 195.63 / 16.23 = 12.05

By reference to the F-table, it is observed that the obtained F-value exceeds the
1.0 per cent point in both cases.

The other computations are as follows:

Standard error of a single determination (s) = √16.23 = 4.029

Standard error of the mean for each variety (σx̄) = s/√n = 4.029/√4 = 2.0143

Standard error of a difference (σd) = σx̄ √2 = (2.0143)(1.4142) = 2.8486

Level of significance for 5 pct. point = (σd)(t) (for 27 d.f.)
                                       = (2.8486)(2.052) = 5.8453

In this case, 2.052 times the standard error of the difference gives odds of 19:1.
This value can be obtained from the "t" table by Fisher (1934) where "t" is taken for
the degrees of freedom for error at the 5 per cent point.
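
The whole computation for the randomized block test can be retraced in a few lines of Python. This is a sketch rather than a definitive implementation: plot values that are hard to read in the printed table were set so that every printed block total, variety total, and S(x^2) is reproduced, which is an assumption, and the names `YIELDS` and `randomized_block_anova` are illustrative.

```python
import math

# Plot yields by block (rows) and variety (columns 1 to 10). Entries that are
# hard to read in the printed table were chosen to agree with the printed
# block and variety totals (an assumption of this sketch).
YIELDS = [
    [34.0, 16.0, 34.1, 14.5, 18.5, 29.9, 28.6, 16.0, 17.3, 23.1],  # block I
    [14.0, 11.0, 20.5, 13.5, 15.6, 28.2, 27.6,  8.3, 12.1, 29.5],  # block II
    [26.6,  9.0, 29.3,  7.9, 13.4, 25.5, 23.5,  5.6,  8.1, 22.6],  # block III
    [18.5, 11.9, 21.0, 15.2,  8.9, 28.8, 16.5,  9.5, 10.3, 17.7],  # block IV
]
T_5_PCT_27_DF = 2.052  # "t" for 27 d.f. at the 5 per cent point, as quoted in the text


def randomized_block_anova(data):
    """Sums of squares, F-values, and standard errors for a randomized block test."""
    n_blocks, n_var = len(data), len(data[0])
    N = n_blocks * n_var
    cf = sum(map(sum, data)) ** 2 / N                 # correction factor (Sx)^2/N
    ss_total = sum(x * x for row in data for x in row) - cf
    ss_blocks = sum(sum(row) ** 2 for row in data) / n_var - cf
    ss_var = sum(sum(col) ** 2 for col in zip(*data)) / n_blocks - cf
    ss_error = ss_total - ss_blocks - ss_var
    ms_error = ss_error / ((n_blocks - 1) * (n_var - 1))
    s = math.sqrt(ms_error)                           # s.e. of a single determination
    se_mean = s / math.sqrt(n_blocks)                 # each variety mean rests on n_blocks plots
    se_diff = se_mean * math.sqrt(2)
    return {
        "ss_total": ss_total,
        "ss_blocks": ss_blocks,
        "ss_varieties": ss_var,
        "ss_error": ss_error,
        "F_blocks": ss_blocks / (n_blocks - 1) / ms_error,
        "F_varieties": ss_var / (n_var - 1) / ms_error,
        "s": s,
        "se_mean": se_mean,
        "se_diff": se_diff,
    }


result = randomized_block_anova(YIELDS)
for name, value in result.items():
    print(f"{name:13s} {value:10.4f}")
print(f"5 pct. level  {result['se_diff'] * T_5_PCT_27_DF:10.4f}")
```

The last decimals may differ slightly from the printed figures because the hand computation rounds at each step while the sketch carries full precision.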

(c)  Application  to  Mean  Comparisons 

Tests are sometimes found in which the value of z or F, for the comparison of
variances due to varieties and error, just fails to reach the 5 per cent level of
significance. This would indicate that the differences between variety means were of
doubtful significance. In spite of this, certain differences between variety means
can often be found which exceed twice the standard error. The use of twice the
standard error (which gives approximately odds of 19:1 as the degrees of freedom
approach 60) would indicate that certain differences might be significant. However,
in such cases the testimony of the "z" or "F" test should be accepted as correct.
Twice the standard error is not a sufficiently stringent test for the comparison of
the greatest yield difference found in a large set of possible differences. Student
(1927) and Tippett (1937) have both pointed out that, when the highest and lowest
values are compared, the conventional use of twice the standard error to obtain odds
approximately equivalent to the 5 per cent level of significance is no longer valid.
For example, with 10 varieties in the test the difference between the highest and
lowest varieties would need to reach 3.2 times the standard error to lie on the 5
per cent level of significance. On the other hand, when "F" is determined
significant, the practice of using twice the standard error of the difference of two means
as a criterion for significance may be too stringent when the means under
consideration are contiguous in an arrangement of the variety (or treatment) means in order
of magnitude.

IX. The Latin Square

The Latin square design is very efficient where a small number of varieties or
treatments is being tested, but it becomes unwieldy for more than 10. Two
restrictions are imposed on the treatments in this design, i.e., the same treatment can
occur only once in the same row or column. The treatments are arranged at random
within these restrictions. The limitation of the Latin square for a large number of
varieties is due to the requirement of the same number of replications as treatments.
It should be emphasized that the plots need not be square in shape. (See Fisher and
Wishart, 1930). This design gives error control across the field in two directions,
which always takes care of soil gradients. The most generally used Latin squares
vary from 4 by 4 to 10 by 10. Some data from an irrigation study with sugar beets
will be used as an illustration of the Latin square arrangement in the field as well
as for the statistical analysis.

(a) Field Plot Arrangement

The field lay-out for the 5 irrigation treatments (A, B, C, D, and E) was as
follows:

                     Columns
 Rows      1     2     3     4     5

   1       E     D     A     B     C
   2       C     E     D     A     B
   3       A     C     B     E     D
   4       D     B     E     C     A
   5       B     A     C     D     E

(b) Analysis of the Data

The data for the irrigation study are compiled below, followed by the
statistical analysis.

                              Tons Beets Per Acre

                                   Columns                                  Row
 Row         1            2            3            4            5         Totals

  1      18.52 (E)    19.46 (D)    20.66 (A)    22.68 (B)    18.65 (C)      99.97
  2      20.68 (C)    14.29 (E)    18.82 (D)    20.02 (A)    20.58 (B)      94.39
  3      26.04 (A)    17.49 (C)    21.06 (B)    18.91 (E)    20.03 (D)     103.53
  4      22.51 (D)    22.98 (B)    17.15 (E)    17.14 (C)    20.62 (A)     100.40
  5      24.44 (B)    20.25 (A)    18.92 (C)    19.73 (D)    14.07 (E)      97.41

Column
Totals    112.19        94.47        96.61        98.48        93.95       495.70

Treatment       A          B          C          D          E

Totals       107.59     111.74      92.88     100.55      82.94
Means         21.52      22.35      18.58      20.11      16.59

Correction factor = (Sx)^2/N = (495.70)^2/25 = 9828.7396

Total = S(x^2) - (Sx)^2/N = 10,007.8598 - 9,828.7396 = 179.1202

Rows = S(xr^2)/n - (Sx)^2/N = 9,838.1604 - 9,828.7396 = 9.4208

Columns = S(xc^2)/n - (Sx)^2/N = 9,873.9164 - 9,828.7396 = 45.1768

Treatments = S(xt^2)/n - (Sx)^2/N = 9,935.4952 - 9,828.7396 = 106.7556

Error = Total - (rows + columns + treatments)
      = 179.1202 - (9.4208 + 45.1768 + 106.7556) = 17.7670

The data are assembled to complete the analysis:

Variation               Sums of       Mean      Standard         F-Value
 due to       D.F.      Squares      Square      Error       Actual   5% Point

Rows            4        9.4208      2.3552                   1.59      3.26
Columns         4       45.1768     11.2942                   7.63      3.26
Treatments      4      106.7556     26.6889                  18.03      3.26
Error          12       17.7670      1.4806      1.2168

Total          24      179.1202

F = larger variance / smaller variance = 26.6889 / 1.4806 = 18.03

Since the computed F-value is greater than that for the 5 per cent point, significant
differences exist between treatments.

The other constants may be computed as follows:

Standard error of the mean (σx̄) = s/√n = 1.2168/√5 = 0.5440 tons.

Standard error of the difference (σd) = σx̄ √2 = 0.5440 √2 = 0.77

Level of significance (for 5 pct. point) = 2.179 σd = (2.179)(0.77) = 1.68 tons.
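
The Latin square analysis differs from the randomized block analysis only in that rows, columns, and treatments are all removed from the total. A minimal Python sketch under stated assumptions: digits that are unclear in the printed lay-out and yield table were set so that the printed row, column, and treatment totals and S(x^2) are all reproduced, and the names `LAYOUT`, `TONS`, and `latin_square_anova` are illustrative.

```python
import math

# Treatment letter and yield (tons per acre) for each plot, row by row.
# Unclear digits were set to agree with the printed totals (an assumption).
LAYOUT = ["EDABC", "CEDAB", "ACBED", "DBECA", "BACDE"]
TONS = [
    [18.52, 19.46, 20.66, 22.68, 18.65],
    [20.68, 14.29, 18.82, 20.02, 20.58],
    [26.04, 17.49, 21.06, 18.91, 20.03],
    [22.51, 22.98, 17.15, 17.14, 20.62],
    [24.44, 20.25, 18.92, 19.73, 14.07],
]
T_5_PCT_12_DF = 2.179  # "t" for 12 d.f. at the 5 per cent point, as quoted in the text


def latin_square_anova(layout, data):
    """Sums of squares and F for treatments in an m by m Latin square."""
    m = len(data)
    cf = sum(map(sum, data)) ** 2 / (m * m)           # correction factor
    ss_total = sum(x * x for row in data for x in row) - cf
    ss_rows = sum(sum(row) ** 2 for row in data) / m - cf
    ss_cols = sum(sum(col) ** 2 for col in zip(*data)) / m - cf
    totals = {}                                       # treatment totals
    for letters, row in zip(layout, data):
        for letter, x in zip(letters, row):
            totals[letter] = totals.get(letter, 0.0) + x
    ss_trt = sum(t * t for t in totals.values()) / m - cf
    ss_error = ss_total - ss_rows - ss_cols - ss_trt
    ms_error = ss_error / ((m - 1) * (m - 2))
    return {
        "ss_total": ss_total,
        "ss_rows": ss_rows,
        "ss_columns": ss_cols,
        "ss_treatments": ss_trt,
        "ss_error": ss_error,
        "F_treatments": ss_trt / (m - 1) / ms_error,
        "se_diff": math.sqrt(2 * ms_error / m),
        "treatment_means": {k: v / m for k, v in sorted(totals.items())},
    }


latin = latin_square_anova(LAYOUT, TONS)
for name, value in latin.items():
    print(name, value)
print("5 pct. level", latin["se_diff"] * T_5_PCT_12_DF)
```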

The data may be arranged as follows in summary form:

Treatment                           Mean Yield (tons)

    A                                    21.52
    B                                    22.35
    C                                    18.58
    D                                    20.11
    E                                    16.59

Standard Error of the Mean               0.544
Level of Significance (5% point)         1.68

B -- Relation of Type of Experiment to Design

X. Variety and Similar Tests

The variety test is probably the most common type of agronomic field experiment.
Crop varieties are tested for yield in most crop improvement programs to determine
which ones are superior under given soil and climatic conditions. Varieties are
known to differ as to the best rate of planting. Less seed is required under dry land
than under irrigated conditions due to the moisture factor. Carleton (1909) points
out that winter wheats tiller more than spring wheats and, when winter-hardy, may be
sown at a thinner rate. It is not always possible to overcome the objection of
differential response of varieties to different rates of seeding in a variety test.
It is usually safer to use a somewhat higher rate than that recommended to farmers
because variations due to unexpected causes will then have less effect.


Rate and date tests are sometimes combined with variety trials, or they may be
conducted separately. The combined test permits a study of differential variety response
to different rates or dates. A rather wide range of rates on either side of the
optimum is suggested for rate of planting tests in order to determine the point of
maximum yield. A test to determine the most satisfactory dates for planting crops
is usually an exploratory stage in field experimentation to secure this information
for certain environmental conditions. Such tests are usually planted at regular
intervals between dates from extremely early in the planting season to extremely
late. Due to occasional differential varietal response to time of planting,
consideration should be given to the question of planting a variety series at several
different dates.

Experiments of these types may be designed as Latin squares for small precise tests
with up to 10 varieties, while randomized blocks are commonly used for 10 to 30
varieties. For greater numbers in a single experiment, incomplete block designs
should be investigated.

XI. Crop Rotation Experiments

In crop rotations, or other experiments in which a study of residual effects is
made, it is necessary to grow all crops used in the rotation each year in order to
obtain reliable results. Carleton (1909) early called attention to the fact that
this simple but essential matter had been entirely overlooked in many of the older
experiments. For accuracy in a rotation series, every stage or crop must experience
every condition. Each year there must be as many plots as there are crops or stages
in the rotation. For example, in a 4-year rotation of (1) oats seeded to red clover,
(2) red clover, (3) corn, and (4) barley, there must be four plots. The plots must
be at least in duplicate in order to allow for the removal of soil variability.
Adequate replication is the greatest need in crop rotation experiments. In another
block in the same test there may be a plot of each crop in continuous culture,
although this is not always necessary. The crop rotation test must be conducted over
a period of years so that the crop yields will be definitely influenced by the
different rotation treatments. Such a test might be laid out as follows:


[Field plan: Replicates I and II, each containing the four plots of the 4-year
rotation (oats, red clover, corn, and barley) together with continuous-culture plots
of corn, oats, and barley.]

To compare this 4-year rotation with a 3-year rotation, it would be necessary to
wait 12 years. For a 7 and 5-year rotation, the results could be compared at the
end of 35 years, etc.
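
The waiting periods quoted above agree with the least common multiple of the rotation lengths, i.e., the first year in which every rotation has completed a whole number of cycles. That reading is an interpretation of the two figures given, not a rule stated in the text; the helper name `years_until_comparable` is illustrative.

```python
from math import lcm  # available in Python 3.9 and later


def years_until_comparable(*rotation_lengths):
    """First year in which all rotations have finished a whole number of cycles."""
    return lcm(*rotation_lengths)


print(years_until_comparable(4, 3))
print(years_until_comparable(7, 5))
```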

XII.  Cultural  Experiments 

Cultural experiments include such tests as fall vs. spring plowing, methods of
seedbed preparation, surface vs. furrow planting, etc. Field plots are generally
necessary for experiments of this type because of the use of farm machinery. Many dryland
experiments are concerned with cultural methods. The same procedures as for variety
tests are generally satisfactory in tests of this kind.

XIII.  Fertilizer  Experiments 

The most reliable information on the fertilizer needs of soils may be obtained from
the field experiment. Nutrient solutions and sand cultures are used in special
studies. The early fertility experiments at Rothamsted were concerned primarily with
the fertilizer value of certain mineral fertilizers as shown by increased crop yields.
The present long-time fertilizer experiments are concerned more with comparisons of
similar fertilizers, effects on crop plants, and efficiency of fertilizer practices.
The earlier workers often tested one fertilizer at a time, but many present workers
are inclined to favor more comprehensive tests, i.e., the inclusion of several
fertilizers at more than one level. Most investigators use crop yield as the major
criterion of fertilizer response.

(a)  General  Types  of  Fertilizer  Tests 

Fertilizer tests may be conducted for several definite purposes. (1) Deficiency
of Fertilizer Elements in a Field: Results of such tests are applicable only
to the field tested or, at most, to soil of similar type with similar previous
cultural treatment. The results are strictly applicable only for the test year, since
the crop grown may modify conditions for the next season. (2) Efficiency of Single
Fertilizer Elements: For this type of test it is desirable to have the fertilizer
elements tested in minimum. (See Giles, 1914). Several rates of a standard fertilizer
can be compared with one or more rates of a fertilizer that carries the elements in a
different form. Equal rates of each fertilizer can be compared also. (3) Comparative
Methods of Application: This type includes tests on depth of placement, time of
application, placed to side vs. with the seed, etc. (4) Optimum Fertilizer Balance:
This type is concerned with fertilizer balance for various crops. It involves many
complications when made in the field because it is difficult to control or even
measure the fertilizer balance in a field. In such a study it is necessary to
estimate by chemical tests the amount of the fertilizer elements furnished by the soil as
well as the amount applied. Probably the most practical method to make such a study
is to vary each element separately over a wide range of several rates of application.
The regression of yield on amount of the element available (amount in plant plus
amount applied) may then be calculated. A further complication would be to test all
possible combinations of several fertilizers at different levels. The triangle
system suggested by Schreiner and Skinner (1918) may be useful for the computation of
all possible combinations of three fertilizer elements (say P2O5, NH3, and K2O) at
several levels. This triangle system should not be used as a basis for the field
lay-out as originally advocated. Such a test should be designed as a factorial
experiment. (See Chapter 19). (5) Long-Time Effect of Fertilizers: Such tests with
various forms of fertilizers are concerned with the physical and chemical properties
of soils as well as soil productivity.

(b)  Design  of  Fertilizer  Experiments 

Several basic principles should be considered in soil fertility experiments.
A soil profile to a depth of 3 feet is highly desirable for each series of plots.
Before soil treatment experiments are begun, the American Society of Agronomy (1933)
recommends that "representative samples of the soil and subsoil should be carefully
taken for such analyses as may be desired for future reference". In the matter of
plot design the Society cautions that "the lateral translocation of soil or
fertilizer beyond the plot interspaces of soil experiments should be avoided. Since the
manner of fertilizer application may affect yields materially, due consideration
should be given to this problem".

For fertilizer tests where 2 or more fertilizers are applied at 2 or more
levels, the factorial design is suitable. The factorial experiment, explained by
Fisher (1934), Yates (1933), Summerby (1937) and others, involves all combinations
of the fertilizers and levels (or amounts) of application. The study of interactions
is an important consideration in such an experiment. For example, suppose a
fertilizer test is to be conducted with nitrogen, phosphorus, and potassium at two
different rates each. The rates can be designated by subscripts so as to give the 8
possible treatment variants as follows:

N0P0K0, N1P0K0, N0P1K0, N0P0K1, N1P1K0, N1P0K1, N0P1K1, and N1P1K1.

Such an experiment can be planned for a randomized block test or for some form of
the incomplete block test. Goulden (1934) gives some suggestions on the design of
more complicated fertilizer experiments. Residual effects due to past fertility
treatments are discussed by Forester (1937).
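
The treatment variants of a factorial test are simply every combination of the chosen levels, so they can be listed mechanically. A short Python sketch, with the level codes 0 and 1 standing for the two rates of each element; the variable names are illustrative, and the order of the list differs from the order printed above.

```python
from itertools import product

factors = ["N", "P", "K"]   # nitrogen, phosphorus, potassium
levels = [0, 1]             # two rates of each element

# One label per combination of levels, e.g. "N1P0K1".
treatments = [
    "".join(f"{factor}{level}" for factor, level in zip(factors, combo))
    for combo in product(levels, repeat=len(factors))
]
print(len(treatments), treatments)
```

Adding a third rate (`levels = [0, 1, 2]`) raises the count to 27, which shows how quickly a factorial test grows.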

XIV. Pasture Experiments

In experimental pasture work, the investigator may desire to: (1) determine the
amount of herbage produced on an area by different pasture-grass mixtures, (2) find
out the influence of fertilizers on pastures as to yield and survival of the
palatable species, or (3) measure the influence of different grazing
methods on yield and survival. Replication of treatments is vital in any case.

One of the important technique problems is the comparative results from grazing and
mechanical harvest of herbage. Stapledon (1931) states that the animal is the master
factor in pasture studies. He tethered sheep on small plots and moved them twice a
day in the Aberystwyth pasture researches. Certain advantages are claimed for his
tethering method: (1) replicated plots are possible; (2) the experimental sheep are
handled and examined twice per day; (3) grazing will be uniform; and (4) the animal
capacity is increased per unit area. Schuster (1929) recommends at least 4
replications and 3 animals per plot in pasture investigations. The use of grazing methods
permits the effects of trampling on the vegetation to be measured. Pasture plots
may be harvested mechanically, i.e., clipped with a mower or with shears. Brown
(1929) advocates the use of grass shears for small cages, the lawn mower for grass
less than 6 inches high, and a mowing machine for taller herbage. Several studies
have compared grazing and mechanical harvest of pasture plots. Brown (1929) found
that the herbage of grazed and mowed plots varied markedly in time due to animal
preferences. Animals void a large proportion of the fertilizer elements consumed in
feeds, particularly nitrogen and phosphorus. Thus, mowed pastures may be low in
fertilizer elements when compared with grazed pastures. A high correlation between
mowed and grazed yields was found when the mowed areas were changed to the previously
grazed areas every two or three years. Continuously clipped cages have yielded less
than annually mowed cages. Robinson, et al. (1937) found a progressive decrease in
the yields of clipped permanent quadrats in relation to grazed areas.

Sampling methods are often involved in the design of pasture experiments. See
Chapter 16 for further details on this phase, as well as the report of Vinall and others
(193V).

C -- Incomplete Experimental Records 1/

XV. Missing Values in Experiments

In general, replicated field experiments are so arranged that the mean yield for all
plots that receive a given treatment provides the best estimate of the effects of
that treatment. Sometimes the yields of some plots are lost or prove unreliable with
the result that the orthogonality of the original design disappears. Since the
treatment, block, etc., effects are computed from the total yield of all plots in a given
treatment, block, etc., it is necessary to interpolate the yield of the missing plot
in order to use the ordinary analysis of variance.

Allan and Wishart (1930) were the first to provide formulae for the estimation of
the yield for a single missing plot in either randomized block or Latin square tests.
They arrived at their formula by the procedure of fitting constants by least squares.
Yates (1933) used a simpler solution by minimizing the error variance obtained when
unknowns are substituted for the missing yields. The two formulae give the same
results, but the one by Yates also provided a method appropriate for the estimation
of the yields of several missing values. His formula is used here.

XVI.  Calculation  of  Single  Missing  Value 

A single missing value can be calculated for either a randomized block or Latin
square test.

(a) Randomized Block Test

Some data are given on the effect of date of planting on the yields of sugar
beets in which a plot value is missing. The yields are in tons per acre.

   Date                        Block Number
  Planted        1        2        3        4        5          Totals

Early          22.3     21.8     19.7     21.2     20.0      105.0
Medium         18.3     18.4     18.5     21.5     17.3       94.0
Late           17.2     17.2     17.9    (18.8)    16.7      (87.8)    69.0
Very Late      14.9     12.6     13.1     14.4     12.4       67.4

Totals         72.7     70.0     69.2    (75.9)    66.4     (354.2)   335.4
                                          57.1

It is assumed that the yield of the late-planted plot in block 4 is missing. The
sums for block 4 and for late planting are given below or to the right of the
appropriate block or treatment to show that they are the sums of only the known plots.
The values in brackets are filled in later.

The formula for the estimation of yield of this value in a randomized block test is
as follows:

    x = (mM + m'M' - Tx) / ((m - 1)(m' - 1))                                  (1)

where x  = yield of missing plot,
      m  = number of treatments,
      m' = number of blocks,
      M  = sum of known yields of treatment with missing plot,
      M' = sum of known yields of block with missing plot,
      Tx = total yield of known plots.

1/ This portion is taken entirely from an outline prepared by Dr. F. R. Immer, U. of
Minnesota.

In the sugar beet test used as an example,

    x = (4(69.0) + 5(57.1) - 335.4) / ((4 - 1)(5 - 1)) = 18.8

The yield, x = 18.8, is inserted in the table after which the block, treatment, and
general sum are corrected accordingly. These figures are in the brackets. The
analysis of variance will be computed in the usual way, except that the degrees of
freedom for error and total have been reduced by one. The degrees of freedom must be
reduced by one for each plot value interpolated. The analysis of variance is as
follows:

Variation    Degrees of   Sums of      Mean       Standard            F-Value
 due to       Freedom     Squares     Square      Error (s)    Observed   5 pct. point

Blocks           4         13.043      3.2608                    3.71        3.36
Treatments       3        149.618     49.8727                   56.69        3.59
Error           11          9.677      0.8797      0.9379

Total           18        172.338
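
Formula (1) and the accompanying loss of one degree of freedom can be checked with a few lines of Python. This is a minimal sketch using the sugar beet figures of this section; the function name `missing_plot_rcb` and the variable names are illustrative.

```python
def missing_plot_rcb(m, m_prime, M, M_prime, Tx):
    """Formula (1): estimate of one missing plot in a randomized block test."""
    return (m * M + m_prime * M_prime - Tx) / ((m - 1) * (m_prime - 1))


# Sugar beet example: 4 dates of planting, 5 blocks,
# late planting in block 4 missing.
late_known = [17.2, 17.2, 17.9, 16.7]    # known plots of the late planting
block4_known = [21.2, 21.5, 14.4]        # known plots of block 4
M = sum(late_known)                      # 69.0
M_prime = sum(block4_known)              # 57.1
Tx = 335.4                               # total of the 19 known plots

x = missing_plot_rcb(m=4, m_prime=5, M=M, M_prime=M_prime, Tx=Tx)
print(round(x, 1))

# One degree of freedom is given up for each interpolated plot.
n_missing = 1
df_error = (4 - 1) * (5 - 1) - n_missing
df_total = 4 * 5 - 1 - n_missing
print(df_error, df_total)
```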

(b) Latin Square Test

The formula to be used for the interpolation of a single value in a Latin
square test is as follows:

    x = (m(Mr + Mc + Mt) - 2Tx) / ((m - 1)(m - 2))                            (2)

where x          = missing plot yield;
      Mr, Mc, Mt = totals of known yields of the row, column, and treatment
                   from which the plot is missing;
      m          = number of treatments (also equals number of rows or columns);
      Tx         = total yield of all known plots.
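
No worked example accompanies formula (2), so the sketch below supplies a hypothetical one: it supposes, purely for illustration, that the plot in row 3, column 1 of the irrigation Latin square of section IX (treatment A, 26.04 tons) had been lost, and uses the printed row, column, treatment, and grand totals of that test. The function name `missing_plot_latin` is illustrative.

```python
def missing_plot_latin(m, Mr, Mc, Mt, Tx):
    """Formula (2): estimate of one missing plot in an m by m Latin square."""
    return (m * (Mr + Mc + Mt) - 2 * Tx) / ((m - 1) * (m - 2))


# Hypothetical loss of the plot in row 3, column 1 (treatment A).
lost = 26.04
Mr = 103.53 - lost     # row 3 total without the lost plot
Mc = 112.19 - lost     # column 1 total without the lost plot
Mt = 107.59 - lost     # treatment A total without the lost plot
Tx = 495.70 - lost     # grand total of the 24 known plots

x_latin = missing_plot_latin(m=5, Mr=Mr, Mc=Mc, Mt=Mt, Tx=Tx)
print(round(x_latin, 2))
```

The estimate falls below the yield actually recorded for that plot, which is to be expected: the interpolated value reflects only the row, column, and treatment effects, not the individual deviation of the plot.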

XVII. More than One Missing Value

A method of approximation may be used for more than one missing plot yield. Three
plots are missing in the randomized block trial given below:

   Date
  Planted        1        2        3        4        5           Total

Early          22.3     21.8    (21.2)    21.2     20.0     (106.5)    85.3
Medium         18.3    (18.6)    18.5     21.5     17.3      (94.2)    75.6
Late           17.2     17.2     17.9    (18.7)    16.7      (87.7)    69.0
Very Late      14.9     12.6     13.1     14.4     12.4       67.4

Totals         72.7    (70.2)   (70.7)   (75.8)    66.4     (355.8)   297.3
                        51.6     49.5     57.1


The plot yields given in brackets have been assumed to be missing. As it is possible
to interpolate the yield of only one plot at a time, one must assume yields for all
missing plots except the one to be interpolated. First, suppose the medium planting
in block 2 is interpolated. For the early plot in block 3 and the late plot in block
4, one must insert the mean yield of the known plots for those two dates, or 21.3 and
17.2 tons, respectively.

The formula for interpolation, in which have been substituted the values for the
first approximation for the yield of the medium planting in block 2, is as follows:

$$x = \frac{mM + m'M' - T_x}{(m-1)(m'-1)} = \frac{4(75.6) + 5(51.6) - 335.8}{(4-1)(5-1)} = 18.7$$

The same procedure can be followed for the early planting in block 3, except that the
guess of 21.3 used before should be removed. The interpolated value (18.7) is used
for the medium planting in block 2, and the guessed value (17.2) for the late planting
in block 4. The grand total is corrected accordingly. The value of x in this
case is 21.3.

In like manner the yield of the late planting in block 4 is interpolated. This is
found to be 18.7.

Since it was necessary to estimate the yields of two plots in order to start the
interpolation process, the values obtained will be somewhat in error. Therefore, the
values are re-interpolated, using the values obtained by the first interpolation for
all but the plot yield being calculated. This is repeated until no further changes
take place. The values obtained in this case were as follows:

                          Approximations
Treatment    Block      1st      2nd      3rd

Medium         2        18.7     18.6     18.6
Early          3        21.3     21.2     21.2
Late           4        18.7     18.7     18.7

The  interpolated  values  did  not  change  after  the  second  approximation. 
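
The repeated interpolation described above can be sketched in Python. This is a minimal illustration, not the authors' own procedure: the function names are hypothetical, the plot yields are entered as read from the table of this section, the starting guesses are the treatment means of the known plots, and the cycle is carried to numerical convergence rather than rounded to one decimal at each step. The order in which the plots are visited also differs from the text, so the intermediate approximations need not match; only the settled values are comparable.

```python
# Yields as read from the table above; None marks a missing plot.
# Rows: Early, Medium, Late, Very Late. Columns: blocks 1 to 5.
yields = [
    [22.3, 21.8, None, 21.2, 20.0],
    [18.3, None, 18.5, 21.5, 17.3],
    [17.2, 17.2, 17.9, None, 16.7],
    [14.9, 12.6, 13.1, 14.4, 12.4],
]


def estimate(table, r, c):
    """Single-plot formula x = (mM + m'M' - Tx) / ((m-1)(m'-1)).

    Every cell of `table` must hold a number; the current value in (r, c)
    is excluded from all three totals.
    """
    m, mb = len(table), len(table[0])                 # treatments, blocks
    M = sum(table[r]) - table[r][c]                   # treatment total of other plots
    Mb = sum(row[c] for row in table) - table[r][c]   # block total of other plots
    Tx = sum(map(sum, table)) - table[r][c]           # grand total of other plots
    return (m * M + mb * Mb - Tx) / ((m - 1) * (mb - 1))


def interpolate(table, tol=1e-9, max_rounds=100):
    """Cycle through the missing plots until the estimates stop changing."""
    work = [row[:] for row in table]
    missing = [(r, c) for r, row in enumerate(table)
               for c, v in enumerate(row) if v is None]
    # Starting guesses: mean of the known plots of the same treatment.
    for r, c in missing:
        known = [v for v in table[r] if v is not None]
        work[r][c] = sum(known) / len(known)
    for _ in range(max_rounds):
        change = 0.0
        for r, c in missing:
            new = estimate(work, r, c)
            change = max(change, abs(new - work[r][c]))
            work[r][c] = new
        if change < tol:
            break
    return {(r, c): work[r][c] for r, c in missing}


result = interpolate(yields)
for (r, c), value in sorted(result.items()):
    print("treatment row", r + 1, "block", c + 1, round(value, 1))
```

Rounded to one decimal, the settled estimates can be compared with the final column of the approximations table.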

The interpolated yields are inserted in the above table (as shown in brackets), after
which the correct treatment and block totals are determined. The analysis of variance
is then computed as shown below:

Variation     Degrees     Sums       Mean      Standard         F-Value
due to        Freedom    Squares    Square     Error (s)   Obtained   5% Point

Blocks           4        11.923     2.9808                   3.23        3.63
Treatments       3       160.306    53.4353                  57.88        3.86
Error            9         8.309     0.9232     0.9608

Totals          16       180.538

Three  degrees  of  freedom  have  been  subtracted  from  error  and  from  total  because  three 
plot  yields  were  interpolated. 
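
The analysis with reduced degrees of freedom can be checked with a short Python sketch. It is an illustration only: the function name is hypothetical, and the completed table (interpolated yields included) is entered as read from this section. The only departure from an ordinary randomized block analysis is that the error and total degrees of freedom are reduced by the number of interpolated plots.

```python
# Completed table, interpolated yields included.
# Rows: Early, Medium, Late, Very Late. Columns: blocks 1 to 5.
completed = [
    [22.3, 21.8, 21.2, 21.2, 20.0],
    [18.3, 18.6, 18.5, 21.5, 17.3],
    [17.2, 17.2, 17.9, 18.7, 16.7],
    [14.9, 12.6, 13.1, 14.4, 12.4],
]


def rb_anova(table, n_interpolated=0):
    """Randomized block analysis of variance with interpolated plots."""
    t, b = len(table), len(table[0])          # treatments, blocks
    grand = sum(map(sum, table))
    cf = grand ** 2 / (t * b)                 # correction factor
    ss_total = sum(v * v for row in table for v in row) - cf
    ss_trt = sum(sum(row) ** 2 for row in table) / b - cf
    ss_blk = sum(sum(col) ** 2 for col in zip(*table)) / t - cf
    ss_err = ss_total - ss_trt - ss_blk
    df_blk, df_trt = b - 1, t - 1
    # One degree of freedom is lost from error and total per interpolated plot.
    df_err = df_blk * df_trt - n_interpolated
    df_total = t * b - 1 - n_interpolated
    ms_err = ss_err / df_err
    return {
        "ss": {"blocks": ss_blk, "treatments": ss_trt,
               "error": ss_err, "total": ss_total},
        "df": {"blocks": df_blk, "treatments": df_trt,
               "error": df_err, "total": df_total},
        "F": {"blocks": (ss_blk / df_blk) / ms_err,
              "treatments": (ss_trt / df_trt) / ms_err},
    }


out = rb_anova(completed, n_interpolated=3)
for source in ("blocks", "treatments", "error", "total"):
    print(source, out["df"][source], round(out["ss"][source], 3))
```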

XVIII. Tests of Significance

The error calculated from analyses of variance, in which one or more plot values have
been interpolated, is a valid estimate of experimental error when the degrees of
freedom have been reduced by one for each value interpolated. However, the variance
due to treatments is not entirely without bias, being always higher than it should
be. The significance of the test is accentuated, but the correction for this condition
is quite trivial for cases in which only a single value is missing. The bias
is more pronounced where many plots are missing.

Tests of significance by means of the analysis of variance are generally all that are
required. For a single missing plot, the treatment mean with the estimated value of
the missing plot will have an error as follows for a randomized block test:

$$\sigma^2_{\bar{x}_t} = \frac{s^2}{m'}\left[1 + \frac{m}{(m-1)(m'-1)}\right] \qquad (3)$$

Where m' = number of blocks, m = number of treatments, and s^2 = variance of a single
plot calculated from error. The variance of the treatment mean would be s^2/m' where
no plot was missing.

For a single missing plot in a Latin square, the variance of the treatment mean with
the missing value would be as follows:

$$\sigma^2_{\bar{x}_t} = \frac{s^2}{m}\left[1 + \frac{m}{(m-1)(m-2)}\right] \qquad (4)$$
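
As a hedged illustration of formulas (3) and (4), the Python sketch below computes the variance of a treatment mean that contains one interpolated plot. The function names are hypothetical. The randomized block example assumes the error variance (0.8797), 4 treatments, and 5 blocks of the single-missing-plot analysis given earlier; the Latin square example uses a made-up unit variance.

```python
def var_mean_randomized_block(s2, m, m_blocks):
    """Formula (3): m treatments, m_blocks blocks, one interpolated plot."""
    return s2 / m_blocks * (1 + m / ((m - 1) * (m_blocks - 1)))


def var_mean_latin_square(s2, m):
    """Formula (4): m x m Latin square, one interpolated plot."""
    return s2 / m * (1 + m / ((m - 1) * (m - 2)))


s2 = 0.8797                               # error mean square, 11 d.f.
with_missing = var_mean_randomized_block(s2, m=4, m_blocks=5)
complete = s2 / 5                         # mean with no missing plot
print(round(with_missing, 4), round(complete, 4))

# Hypothetical Latin square case with unit error variance.
print(round(var_mean_latin_square(1.0, m=4), 4))
```

The mean that contains the interpolated plot carries a larger variance than a complete mean, which is why comparisons involving it call for the larger standard error.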


For more than one missing plot, these formulae are strictly applicable only to
comparisons between means where one contains no missing plot. To find the variance of
the difference between two means, both of which contain missing values, is rather
difficult.

Questions for Discussion

1.  What  criticisms  have  been  made  of  agronomic  experiments  in  general?  Are  they 
justified? 

2. What justification is there for the antagonism sometimes found between scientific
theory and practical facts?

3.  What  are  the  principal  objectives  in  agricultural  experiments? 

4. What factors should be considered in the outline of an experiment?

5.  What  is  the  principle  of  the  extremes?  Illustrate. 

6.  In  laying  out  field  experiments  in  which  one  variable  is  continuous,  what  prin- 
ciple or  rule  should  be  followed  with  respect  to  the  extremes? 

7.  Distinguish  between  preliminary  and  permanent  experiments. 

8.  What  is  a  simple  experiment?  Its  limitations?  Advantages? 

9. What are combination or complex experiments? Are they desirable? Why?

10.  What  sources  of  variation  or  error,  other  than  that  due  to  soil  or  season,  may 
occur  in  field  experiments? 

11.  How  does  a  random  arrangement  differ  from  a  systematic  arrangement? 

12.  Is  soil  heterogeneity  systematic  or  random?  Explain. 

13. Upon what is the estimate of error based? How influenced by a systematic plot
arrangement?

14. What are the advantages usually given for systematic arrangement? Random
arrangement?

15. What is meant by the "knight's move"?

16.  Discuss  the  relative  efficiency  of  systematic  and  random  plot  arrangements. 

17.  What  is  a  randomized  block  test?  What  restrictions  are  imposed?  Its  limitations? 

18.  What  is  the  Latin  square  arrangement  of  plots?  What  is  the  primary  objective  in 
this  arrangement? 

19.  What  conditions  should  be  observed  in  planning  variety  tests?  What  is  a  check 
variety? 

20. What precautions are necessary in crop rotation tests?

21. What are the limitations in fertilizer tests?

22. In rotation and soil treatment tests what should be the treatment of the check?

23.  What  is  the  law  of  the  minimum?  Its  application  to  fertilizer  tests? 

24.  What  is  a  factorial  experiment?  Give  an  example. 

25. Discuss grazing vs. mechanical harvest of herbage.

26. Why is it necessary to calculate missing values for the analysis of variance to
apply?

27. How are the degrees of freedom modified when a missing value is computed?


Problems 

1. Different amounts of fertilizer were applied to sugar beets by the Colorado Experiment
Station in 193° (Data from D. W. Robertson) in a randomized block trial. The
yields in pounds of sugar per plot for various amounts of treble superphosphate
applied per acre were as follows:


Phosphate 
Treatment 

None 
100  lbs. 
200  lbs. 
300  lbs. 


Totals 


Block 

I 

II 

343 

I85. 

358 

413 

393 

.435 

427   ' 

468 

III 


208 

483 

463 
487 


Total 

730 
1256 
1291 
1382 

1321 


150" 


10 


4l 


£<*; 


466 


(a) Compute the analysis of variance for a randomized block experiment.

(b) Determine significance by use of the "F" test.

(c) Compare the average yields of the no treatment and 200 lb. treatment by
means of the standard error.


2. A rate of planting test with sugar beets was conducted in 1931 by H. E. Brewbaker.
The rates used were: 15, 20, 23, and 30 lbs. per acre. The experiment was designed
as a 4 by 4 Latin square, the data for which follow:


                    Tons beets per acre                       Row totals

(3) 16.73      (1) 15.38      (4) 16.35      (2) 15.27          63.73
(4) 17.74      (2) 17.20      (3) 18.83      (1) 16.94          70.71
(1) 17.52      (3) 18.15      (2) 17.97      (4) 18.31          71.95
(2) 18.21      (4) 19.53      (1) 17.74      (3) 16.61          72.09

Column
Totals 70.20       70.26          70.89          67.13         278.48


(a)  Compute  the  analysis  of  variance. 

(b)  Obtain  "F"  for  a  comparison  of  error  with  rows,  columns,  and  treatments. 

(c) Test the significance of "F".

(d) Continue the analysis and compute the standard error of the mean, standard
error of a difference, and level of significance in case it is justified.

3. Design a crop rotation experiment to show the effects of a legume in a rotation.
The rotations are as follows: (a) barley (seeded to alfalfa), alfalfa, alfalfa,
corn, and sugar beets; and (b) barley, corn, and sugar beets.

4. Four varieties of wheat were grown in 1930 in a randomized block trial in 5 blocks.
The yield of one plot was lost. (a) Calculate the yield of the missing plot.
(b) Complete the analysis of variance.


Block 


Variety 


Kanred 

5k.h 

in  .7 

52.1 

56.1 

61.0 

Cheyenne 

ko.7 



1^6.5 

59-9 

53-7 

Tenmarq. 

61.7 

51.7 

^3-5 

61.9 

58.7 

Hays  No.  2 

55.5 

50.6 

61.9 

1*5.1 

72.  k 

5. The same 4 varieties were grown in a randomized block test in 1937. The records
on 2 plots were lost. Calculate the missing values and complete the analysis of
variance.


Block 


Variety 


5 


Kanred 

5^.6 

53.7 

68.0 

55.2 

58.5 

62.1 

Cheyenne 

66.3 

60.9 

6^.8 

67.6 

__-._ 

66.2 

Tenmarq. 

58.5 

57-5 

kk.i 

65.6 

52.9 

51.6 

Hays  No.  2 

57.3 



60.5 

62.2 

58.8 

5^-3 

CHAPTER XVI
QUADRAT AND OTHER SAMPLING METHODS

I. Sampling in Agronomic Work

There are times when it is impractical to use the whole plot or plant population to
obtain a numerical determination of some characteristic of the experimental material.
In such cases as tiller number, yield, percentage dry matter, nitrogen or sugar in
the crop, it is more practical to sample only a proportion of the whole. To quote
Wishart and Sanders (1935): "The object is to obtain as close an estimate as we can
of the measure, which would be obtained accurately, within the limits of experimental
error, had the produce of the whole plot been counted, weighed, or analyzed." The
sample must be representative and taken in such a manner as to assure that end. It
is also necessary to take into account the further source of error due to the sampling
process. Yields determined by sampling procedure are not determined as accurately as
when the entire plot is taken, but it is often advantageous to sacrifice some accuracy
to save labor.

II. Theory of Sampling

The sampling distributions so far considered have been based on the assumption of
independence. The simple theory of errors does not apply when the variation is
heterogeneous and the extent to which the sources of variation are represented is not left
to chance. It has been shown in a randomized block trial that the variance due to
error is an unbiased estimate of the error variance of the infinite population from
which the data under consideration are a sample. The other items in the mean square
column (blocks and varieties) are not unbiased estimates of the respective variances
of the population. In fact, they contain the variance due to error as the degrees
of freedom become indefinitely large. For example, the estimated variance due to
varieties, in the theory of large samples, is made up of the true variance due to
varieties plus the variance due to error. This becomes important in statistics of
estimation as shown by Tippett (1937), Immer (1932, 1936), and others.

Suppose some data on protein in relation to different rate-of-planting treatments in
corn for 1931 be used to illustrate the computations:


Method 

Kate 

Prote: 

.n  Per  cent  per 

Sample  v/ 

Planted 

Planted 

B3 

.ock  I 

Block  11 

Block  III 

Variety 

(i) 

(2) 

(1)              (2) 

(1) 

(2) 

Totals 

Golden  Glow 

Hills 

o 

10.357' 

10.4o8 

10A25    10. 522 

10 . 1  00 

10.043 

oj.  .  op_; 

:l                      it 

a 

9-525 

9.422 

9.228        9.3^2 

9.667 

9.543 

11                      ii 

•s 

8.995 

3.903 

9'.  325      9.211 

v<  .  -vd.    . 

9.479 

35^15 

Pride  North 

3 

IO.363 

10.351 

9.713      9-553 

9.627 

59.212 

n            ii 

4 

0.171 

9-2^5 

9.399      9.576 

9.057 

9.052 

<".<=;   -\r\  e 

:l                           |1 

5 

9 .  loo 

9 .  120 

9.171       9.211 

8.527 

8.504 

53.690 

Golden  Glow 

Drills 

12^ 

IO.072 

ic. 038 

10.528    io.4o8 

IO.38O 

10.438 

61.714 

■M             it 

9 

9.750 

9  •"'-/? 

9.696      9.559 

9-^33 

9.4.51 

57.474- 

ii             ii 

o 

8.8^6 

3.778 

a.  old        3.oo4 

9.143 

9.080 

93.512 

ii             H 

3 

8.482 

8 .  590 

9Jl79        9.4-22 

go1^ 

9.002 

55.02p 

Pride  Worth 

1  O 

JL.w 

9.872 

9.929 

10.009      9.384 

9.827 

0.724 

l59.7^i  '■ 

H            H 

9 

9.325 

9A6B 

8.853      3.892 

9.H4 

0. 14° 

o4.8oo 

■I            ,i 

6 

O      '-vO-"i 

0.  ;Oy 

3.761 

9.523       9.365 

8.832 

3.802 

34.384 

.1                      ;l 

3 

8.64.1 

8.767 

9.260      9,428 

3.455 

8.510 

33.067 

Totals 

■ 

.31.323   3 

.31,459  1 

.33.236  133.237 

131.123 

131.o5q 

791.432 

Protein = N (nitrogen) x 5.7. Rates for hills are plants per hill; rates for drills
are inches between plants in row.

The analysis of variance is set forth for the experiment in which two protein
determinations were made on the shelled corn per plot. For simplicity, treatments will
be considered without regard to variety or method of planting.

The calculations for the sums of squares are as follows, the two samples per plot
being added together for the plot determinations:

$$S(x) = 791.432 \qquad (Sx)^2/N = 7{,}456.7216$$

$$S(x_s^2) - (Sx)^2/N = 7{,}480.8643 - 7{,}456.7216 = 24.1427$$

$$S(x_p^2)/2 - (Sx)^2/N = 14{,}961.4426/2 - 7{,}456.7216 = 23.9997$$

$$S(x_b^2)/28 - (Sx)^2/N = 208{,}799.0186/28 - 7{,}456.7216 = 0.3862$$

$$S(x_t^2)/6 - (Sx)^2/N = 44{,}852.7957/6 - 7{,}456.7216 = 18.7444$$

The summary for the analysis of variance is as follows:

Variation          Degrees     Sums       Mean      Standard     F-Value
due to             Freedom    Squares    Square      Error

Blocks                 2       0.3862     0.1931                   1.03
Treatments            13      18.7444     1.4419                   7.70**
Error                 26       4.8691     0.1873     0.4328

Total for Plots       41      23.9997

Samples within
Plots                 42       0.1430     0.0034     0.0583

Total samples         83      24.1427
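
The arithmetic behind this summary can be retraced with a short Python sketch. It is an illustration under stated assumptions: the raw sums of squares and the grand total are entered as read from the worked calculations of this section (the scan is damaged, so they should be treated as assumed readings), and the variable names are hypothetical. Each sum of squares is the raw sum, divided by the number of samples behind each total, less the correction factor.

```python
N_SAMPLES = 84                     # 14 treatments x 3 blocks x 2 samples
grand_total = 791.432
cf = grand_total ** 2 / N_SAMPLES  # correction factor (Sx)^2 / N

# Raw sums of squares divided by the number of samples in each total.
raw = {
    "samples": 7480.8643,          # individual samples
    "plots": 14961.4426 / 2,       # plot totals, 2 samples each
    "blocks": 208799.0186 / 28,    # block totals, 28 samples each
    "treatments": 44852.7957 / 6,  # treatment totals, 6 samples each
}
ss = {name: value - cf for name, value in raw.items()}
ss["error"] = ss["plots"] - ss["blocks"] - ss["treatments"]
ss["within"] = ss["samples"] - ss["plots"]   # samples within plots

df = {"blocks": 2, "treatments": 13, "error": 26, "within": 42}
ms = {name: ss[name] / df[name] for name in df}
f_treatments = ms["treatments"] / ms["error"]

for name in ("blocks", "treatments", "error", "within"):
    print(name, df[name], round(ss[name], 4), round(ms[name], 4))
print("F for treatments:", round(f_treatments, 2))
```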

In the simple case where one sample is drawn from each plot with the treatment
replicated for m plots, the variance of a treatment mean is $V_p^2/m$, where $V_p^2$, the mean
variance between plots, approaches $\sigma_p^2$, the true variance of an individual plot, as m
approaches infinity. However, when n samples are drawn from each plot, the variance
of a treatment mean is $V_p^2/mn$, where $V_p^2/n$ estimates $\sigma_p^2$, the true variance of an
individual plot, plus the true variance of an individual plot mean, or $\sigma_s^2/n$. This
follows because a plot mean is now subject to variation due to more than one sample.
It is evident that $\sigma_s^2$ is the true variance of an individual sample taken from a
plot. The relationship may be shown as follows:

$$\frac{V_p^2}{mn} \rightarrow \frac{\sigma_p^2}{m} + \frac{\sigma_s^2}{mn} = \frac{1}{m}\left(\sigma_p^2 + \frac{\sigma_s^2}{n}\right) \qquad (1)$$

It should also be noted that

$$V_s^2 \rightarrow \sigma_s^2 \qquad (2)$$

It is clear that $\sigma_p^2$ can be estimated from the above formula because $V_p^2$ and $V_s^2$ are
both obtainable from the analysis of variance.

In the present experiment,

$$V_p^2 = 0.1873, \quad V_s^2 = 0.0034, \quad n = 2, \quad \text{and} \quad m = 3.$$

$$\frac{V_p^2}{mn} = \frac{1}{m}\left(\sigma_p^2 + \frac{\sigma_s^2}{n}\right), \quad \text{and} \quad 0.0034 = \sigma_s^2$$

Therefore,

$$\frac{0.1873}{6} = \frac{1}{3}\left(\sigma_p^2 + \frac{0.0034}{2}\right) \quad \text{or} \quad 0.0919 = \sigma_p^2.$$

The standard errors for plot and sample means are then calculated.

$$\sqrt{0.0919} = 0.3032 = \sigma_p, \quad \text{and} \quad \sqrt{0.0034} = 0.0583 = \sigma_s.$$

The ratio $\sigma_p/\sigma_s$ is estimated as 0.3032/0.0583 = 5.20. This indicates that the
variation between plots greatly exceeds that within plots or between samples, being
5.20 times as great.
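
The separation of the two variance components can be sketched in Python. The function name is hypothetical, and the mean squares (0.1873 for error between plots, 0.0034 for samples within plots) are assumed readings from the analysis of variance above. The sketch rearranges relationship (1) as $V_p^2 = n\sigma_p^2 + \sigma_s^2$ and solves for $\sigma_p^2$.

```python
import math


def variance_components(v_p2, v_s2, n):
    """Split the plot error mean square into plot and sample components.

    v_p2: error mean square between plots (on a single-sample basis)
    v_s2: mean square for samples within plots
    n:    samples drawn per plot
    """
    sigma_s2 = v_s2                    # relationship (2)
    sigma_p2 = (v_p2 - sigma_s2) / n   # from V_p^2 = n*sigma_p^2 + sigma_s^2
    return sigma_p2, sigma_s2


sigma_p2, sigma_s2 = variance_components(0.1873, 0.0034, n=2)
ratio = math.sqrt(sigma_p2) / math.sqrt(sigma_s2)
print(round(sigma_p2, 5), round(sigma_s2, 4), round(ratio, 2))
```

The unrounded plot component is 0.09195, which the text carries as 0.0919.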

III. Economy in Sampling

It is of considerable importance to analyze how the precision of an experiment, as
measured inversely by $\frac{1}{m}(\sigma_p^2 + \sigma_s^2/n)$, is affected by varying m, the actual plot
replications in the field, and n, the number of samples drawn from a plot. The most
important inference to be drawn is that the precision is mainly controlled by m, the
number of plot replications. Increasing the number of samples taken from the different
plots can only appreciably affect the precision when $\sigma_s^2$ is not relatively small
as compared with $\sigma_p^2$. In the present problem 0.0034, the estimated value of $\sigma_s^2$, is
small compared with 0.0919, the estimated value of $\sigma_p^2$. Hence, it must be concluded
that to make more than one analysis on a sample from a plot was unwarranted by the
small gain that would result.

(a) Computation of Number of Samples or Replicates

The required variance of the mean of a treatment (K) would be:

$$K = \frac{1}{m}\left(\sigma_p^2 + \frac{\sigma_s^2}{n}\right) \qquad (3)$$

For the data in this problem,

$$K = \frac{1}{3}\left(0.0919 + \frac{0.0034}{2}\right) = 0.0312$$

The computation, for different values of m or n, will give the number of replications
and number of samples per plot that will be necessary to reduce the variance of the
mean to a given level, i.e., K = 0.0312. The K values for the estimation of the
variance for treatments are as follows when the number of analyses per sample is
varied for three replications:

Number of Samples (n)      Value of K

        1                    0.0318
        2                    0.0312
        3                    0.0310
        4                    0.0309
        5                    0.0309

These data indicate the negligible effect when 1, 2, 3, 4, or 5 protein analyses are
made from shelled corn samples.
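
The K values can be recomputed with a few lines of Python. This is a sketch only: the function name is hypothetical, and the component estimates (0.0919 between plots, 0.0034 between samples) are the rounded values assumed from the preceding section. The last lines add a made-up comparison showing the effect of a fourth replicate with a single sample per plot.

```python
def variance_of_treatment_mean(sigma_p2, sigma_s2, m, n):
    """K = (1/m) * (sigma_p^2 + sigma_s^2 / n) for m plots and n samples per plot."""
    return (sigma_p2 + sigma_s2 / n) / m


SIGMA_P2, SIGMA_S2 = 0.0919, 0.0034

# Three replications, 1 to 5 samples per plot.
k_values = {n: variance_of_treatment_mean(SIGMA_P2, SIGMA_S2, m=3, n=n)
            for n in range(1, 6)}
for n, k in k_values.items():
    print(n, round(k, 4))

# Hypothetical comparison: one more plot replicate, one sample per plot.
k_four_plots = variance_of_treatment_mean(SIGMA_P2, SIGMA_S2, m=4, n=1)
print("m = 4, n = 1:", round(k_four_plots, 4))
```

Adding samples barely moves K, while a single extra replicate lowers it well below anything reachable with three plots, which is the point made in the text.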

(b) Determination of Minimum Expense

Technical difficulties often prevent plot replication beyond a certain degree.
In such cases it is frequently worthwhile to strengthen the precision of the experiment
by drawing replicate samples from the different plots. The number that should
be drawn depends on several factors. These factors are: (1) the variation between
plots as measured by $\sigma_p^2$ in relation to $\sigma_s^2$, and (2) the cost of growing a plot as
compared with the cost of obtaining and analyzing replicate samples per plot. The
time factor instead of the cost factor, or the combination of the two, should be
considered in many types of experiments.

It is proposed to investigate how these relative costs determine a balance between
plot replicates (m) and sample replicates (n) in order that a stated precision for
an experiment may be obtained at a minimum expense. Let C represent the cost per
plot replicate, and c the cost per sample replicate in the conduct of an experiment.
For a given treatment the total cost of plot replications will be mC, while the total
cost of sample replications will be mnc. Hence, E, the total expense per treatment,
is given by:

$$E = mC + mnc \qquad (4)$$

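A certain criterion of precision to be obtained may be represented by
$K = \frac{1}{m}(\sigma_p^2 + \sigma_s^2/n)$, where K = the required variance of the mean for a treatment.
In order to reduce (4) to an equation with one variable, it is found that
$m = \frac{1}{K}(\sigma_p^2 + \sigma_s^2/n)$, a value which is substituted.

Then,

$$E = \frac{1}{K}\left(\sigma_p^2 + \frac{\sigma_s^2}{n}\right)(C + nc)$$

To reduce the total cost to a minimum, differentiate E with respect to n, and set the
equation equal to zero, viz.,

$$\frac{dE}{dn} = \frac{1}{K}\left[\left(\sigma_p^2 + \frac{\sigma_s^2}{n}\right)c + (C + nc)\left(-n^{-2}\sigma_s^2\right)\right] = 0$$

$$c\sigma_p^2 + \frac{c\sigma_s^2}{n} - \frac{C\sigma_s^2}{n^2} - \frac{c\sigma_s^2}{n} = 0$$

$$n^2 c\sigma_p^2 = C\sigma_s^2$$

$$n^2 = \frac{C\sigma_s^2}{c\sigma_p^2} \quad \text{or} \quad n = \sqrt{\frac{C\sigma_s^2}{c\sigma_p^2}}$$

Thus, the total cost will be a minimum when,

$$n^2 = \frac{C\sigma_s^2}{c\sigma_p^2} \qquad (5)$$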

In this case, n, and hence m, are determined to afford a most economical design. It
is worthwhile to note that n is determined to be independent of K, the precision
desired.

In the present experiment, the values are substituted in the above equation (5), viz.,
n = 2, $\sigma_p^2$ = 0.0919, and $\sigma_s^2$ = 0.0034. The ratio of costs will be:

$$\frac{C}{c} = \frac{0.0919 \times 2^2}{0.0034} = 108.12$$
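
The minimum-expense rule can be checked numerically. The Python sketch below is an illustration with hypothetical function names; it assumes the component estimates 0.0919 and 0.0034, takes the cost per sample as one unit, and uses K = 0.0312 as the required precision. It computes the cost ratio at which n = 2 is optimal, recovers n from equation (5), and confirms by direct search over whole numbers of samples that the expense E is then least at n = 2.

```python
import math


def optimal_samples_per_plot(C, c, sigma_p2, sigma_s2):
    """Equation (5): n = sqrt(C * sigma_s^2 / (c * sigma_p^2))."""
    return math.sqrt(C * sigma_s2 / (c * sigma_p2))


def breakeven_cost_ratio(n, sigma_p2, sigma_s2):
    """Cost ratio C/c at which n samples per plot is the optimum."""
    return n ** 2 * sigma_p2 / sigma_s2


def expense(n, C, c, sigma_p2, sigma_s2, K):
    """E = mC + mnc, with m chosen to reach the required variance K."""
    m = (sigma_p2 + sigma_s2 / n) / K
    return m * C + m * n * c


SIGMA_P2, SIGMA_S2 = 0.0919, 0.0034

ratio = breakeven_cost_ratio(2, SIGMA_P2, SIGMA_S2)
n_star = optimal_samples_per_plot(ratio, 1.0, SIGMA_P2, SIGMA_S2)
cheapest = min(range(1, 11),
               key=lambda n: expense(n, ratio, 1.0, SIGMA_P2, SIGMA_S2, K=0.0312))
print(round(ratio, 2), round(n_star, 2), cheapest)
```

Here m is treated as a continuous quantity, as in the derivation; in practice it would be rounded up to a whole number of plots.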

From the standpoint of expense, the analysis of a duplicate sample from each plot
would have been justifiable to produce the most economical design only if the cost
per additional plot had been 108 times the cost per analysis.

IV. Sampling Practices

The important practical consideration in sampling is that sufficient units should
be taken to give a reasonably accurate representation of the whole. In sampling
processes a small representative amount of the material is analyzed. For field plots,
Wishart and Sanders (1935) advise that the samples should amount to") per cent of
the plot at the very least. It must be recognized that the use of the entire plot
is the most reliable where it is feasible, as sampling can afford only an estimate
of the plot yield. Yates (1935) found yl per cent loss of information as an average
of several experiments where sampling technic was employed.

Some of the practices for different crop material are discussed below. To determine
sampling errors it is necessary to draw at least two independent samples.

(a) Quadrat Methods

Some form of quadrat is generally used for sampling yield trials, or for the
detailed study of vegetation. It was early pointed out by McCall (1917) that, while
the harvest of the entire plot is most satisfactory in yield trials, it is attended
with difficulties that make it practically impossible for plots away from the main
station. A form of quadrat was suggested as a solution. The quadrat may be linear
for a certain length of row, often being units of one foot, one yard, or one rod in
length. The type of quadrat most frequently used in range and pasture experiments
is a square area, usually a square meter or yard. The different kinds of area
quadrats are described by Weaver and Clements (1929).

1. Small Grains: The rod-row unit is widely used by American investigators to
harvest small grain plots by sampling methods. The English workers prefer one-foot or
one-meter lengths of drill row.

The use of the rod-row, to secure a yield estimate for the entire plot, was studied
by Arny and Garber (1919). They harvested 9, 5, and 4 rod-row samples from 1/10-
acre plots of wheat and oats, and subsequently the entire plot. They concluded that
increases over the mean yield of the checks of 15.70 per cent for triplicate 1/10-
acre plots, 9.49 per cent for the nine rod rows, 12.73 per cent for the five rod
rows, and 14.M-!- per cent for the four rod rows were (on the average) probably
significant. Nine rods removed from 1/10-acre plots were concluded to give practically as
accurate yield determinations as for the harvest of the entire plot. It was admitted
that the amount of labor required to remove nine rod rows was about the same as for
the harvest of the entire plot. In studies with wheat and barley, Clapham (1929)
obtained a standard error of less than 6 per cent for the yield estimate where 30
one-meter-length drill rows were harvested from l/jO-acre plots. The rod row method was
compared with meter lengths with six sets of five contiguous meter lengths. In
barley, the standard error for 30 meter-lengths was 5.99 per cent, while the standard
error calculated from sets (rod rows) was 7.20 per cent.

The use of the square yard, in addition to the rod row, has been studied by various
workers. Kiesselbach (1917) determined the yield on 14 entire l/^O-acre plots
compared to 20 areas 32x32 inches (quadrat areas). It was concluded that 20
systematically distributed areas may be safely substituted for the yield of the entire plot.
Arny and Steinmetz (1919) concluded that four or five systematically distributed
square-yard areas removed from 1/10-acre plots gave approximately the same error for
yield as harvesting the entire plot.

2. Potatoes and Other Hill Crops: Yields of potatoes at Rothamsted were analyzed
both by sampling and by harvesting the entire plot. There were 180 plants on each
plot. Wishart and Clapham (1929) reasoned that the individual plant was the logical
unit rather than a metrical unit. The actual number of plants necessary to determine
the plot yield depended upon the uniformity of the crop and the plot size, but rarely
was less than 10. A one-in-10 sample was inadequate for plots 1/90-acre in size,
it being necessary to take every third or fourth plant on a plot that size to give an
error of four per cent. It was concluded that there was little to gain by the use
of sampling methods on plots less than 1/20-acre in size. It was better to harvest
the entire plot for yield. A pattern method has been used to sample sugar beet plots
at harvest time, the individual beet being the unit. Wishart and Sanders (1935) make
this explanation: "Suppose that there are 200 beets in a plot, and that it is
proposed to take two sampling units of 10 beets each. Two numbers between 1 and 20
(inclusive) are drawn--say 4 and 16. The plot is then covered by walking along one
row, back on the next, and so on, and the 4th, 24th, 44th ... beet pulled for one
sampling unit, and the 16th, 36th, 56th ... beet for the other sampling unit, totals
only being recorded in each case." Small samples of 10 roots for sugar analysis were
criticized by Johnson (1929). Five samples of 10 roots from l/^O-acre plots gave a
difference as high as 2.2 per cent sugar. He concluded the results were unreliable
within one per cent of sugar each way. For 50-beet samples taken in groups of 10 he
reduced the estimate of significance to 0.53^ per cent and 0.593 per cent sugar in
two experiments.
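
The pattern method quoted from Wishart and Sanders can be sketched in Python. The function names and the use of a seeded random generator are assumptions for illustration; the plot size (200 beets), the interval (20), and the example starting numbers (4 and 16) follow the quotation.

```python
import random


def positions(start, n_plants=200, interval=20):
    """Beet numbers pulled for one sampling unit: start, start + interval, ..."""
    return list(range(start, n_plants + 1, interval))


def draw_sampling_units(n_units=2, n_plants=200, interval=20, seed=None):
    """Draw distinct starting numbers between 1 and `interval` (inclusive)
    and return the beet numbers belonging to each sampling unit."""
    rng = random.Random(seed)
    starts = sorted(rng.sample(range(1, interval + 1), n_units))
    return {start: positions(start, n_plants, interval) for start in starts}


# The example from the quotation: starting numbers 4 and 16.
print(positions(4))
print(positions(16))

# A fresh random draw (seed chosen arbitrarily for repeatability).
units = draw_sampling_units(seed=1)
print({start: len(beets) for start, beets in units.items()})
```

Because the starting numbers are distinct and the interval is fixed, the sampling units never share a beet and each holds 10 beets from a 200-beet plot.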

3. Pastures: The area quadrat has been widely used in pasture investigations, the
usual size being a square meter. For the determination of the abundance and frequency
of species in range pastures, Hanson (1934) concluded that a quadrat two meters in
size was desirable. Stewart and Hutchings (1936) have suggested the Point-Observation-Plot
method for vegetation surveys. These plots are 100 square feet in area,
being marked off by a circle 5.64 feet in radius. This method is claimed to be more
rapid than the ordinary quadrat method. It is also suitable for statistical analysis.
The vertical point method has been advocated recently by Tinney et al. (1937). Two
horizontal pipes are mounted on legs 1.2 inches high, with a linear row of 10 holes
spaced two inches apart through which needles 14 inches long are moved up and down.
The point is pushed down until it touches a plant or a bare spot. The number of times
a species is hit per 100 readings (needles) is expressed directly in per cent. The
needle may hit a number of different species, e.g., bluegrass 32, timothy 30, redtop
12, red clover 3, and bare space 8. A modification is the inclined point method.
Others who have published on quadrats for pastures are Brown (1937), and Robinson,
et al. (1937).

(b) Sampling from Bulk Material

So far sampling has dealt with plots during growth and harvest. Sampling of
harvested produce, or bulk material, is another important form found in experimentation.
In laboratory determinations, a sample of the material is mixed, and one or
two sub-samples taken. A good method is to mix the heap of material thoroughly, to
divide it into four quarters and to reject, for example, the N.E. and S.W. quarters,
mixing the other two again. The process is repeated until the bulk is reduced to the
size required for a sample.

1. Protein Determinations: Duplicate samples were taken in 241 cars of wheat by
Coleman, et al (1926) to determine the accuracy of sampling for protein. The cars
were sampled twice in 5 different areas with a grain probe. The contents of both
samples were composited separately and each reduced to 73 grams in size. Over 96
per cent of the tests varied less than 0.25 per cent in protein. In a study of sample
size the error was found to be less than 0.10 per cent when the samples weighed
30 grams or more. The error was higher for smaller samples. Bartlett and Greenhill
(1936) and Leonard and Clark (1936) found that the error was reduced more rapidly by
an increase in the number of replicates, with one protein determination per replicate,
than by the use of duplicate laboratory analyses. The latter workers determined the
cost ratio of plot replicate to sample replicate for protein determinations in corn.
The analysis of a duplicate sample from each plot would have been justified in
producing the most economical design only if the cost per plot had been 108 times the
cost per analysis.

2. Shrinkage Samples in Forage: Two or three samples per plot were found by Wilkins
and Inland (193^) to measure accurately the water content of forage on individual
plots of alfalfa and red clover. Samples 2 to 4 pounds in size were considered best.

3. Purity and Germination Tests in Seeds: Tests for germination and purity in seed
analyses are affected by personal, sample, and random errors. Standard rules (1927)
for seed testing specify minimum sample sizes as follows: (1) Two ounces of grass
seeds; (2) Five ounces of red or crimson clover, alfalfa, rye grasses, brome grasses,
millet, flax, rape, or seeds of similar size; (3) One pound of cereals, vetches, or
seeds of similar size. Collins (1929) has set forth the procedure for statistical
analyses of purity and germination tests.

Questions for Discussion


1.  Under  what  conditions  may  it  be  desirable  to  take  samples  rather  than  use  all  the 
available  material? 

2. Are yields determined from samples as accurate as when the entire plot is
harvested for yield? Why?

3.  In  a  randomized  block  trial  what  makes  up  the  variance  due  to  varieties? 

4. When can an increase in the number of samples effect an appreciable increase in
precision?

5. What were the conclusions in the shelled corn data on the number of samples for
protein analysis?

6.  What  factors  influence  the  number  of  samples  that  should  be  drawn  so  far  as  the 
precision  of  the  experiment  is  concerned? 

7. Explain how to determine the most economical design from the standpoint of cost.

8.  Under  what  conditions  are  quadrat  methods  used  for  small  grain  harvest? 
How  many  quadrats  are  usually  advised  for  l/lC-acre  plots? 

9.  Compare  the  rod,  meter-length,  and  area  quadrats  for  small  grains. 

10.  What  is  the  logical  sampling  unit  for  potatoes  or  sugar  beets?  How  many  samples 
should  be  taken? 

11. Describe how to use a pattern method of sampling for sugar beets and similar
crops.

12.  Describe  the  different  quadrat  methods  used  in  pasture  studies. 

13.  Give  sampling  precautions  and  technic  to  use  in  bulk  material. 

14. What errors may be introduced in seed testing? Give the sample sizes generally
used for analyses.


Problems 


Four varieties of crested wheat grass were grown in a randomized block trial in plots
1/80-acre in size. Four replicated plots of each variety were grown, the yield data
being obtained from six quadrats per plot. The yields in pounds per square yard sample
are given below. (Data from Dr. T. M. Stevenson, U. of Saskatchewan).

                        Yields of Crested Wheat Grass

Square Yard   Block   Mecca: S-1   C.W.G.: S-10   C.W.G.: S-11   C.W.G.: Uns
   (No.)      (No.)     (lbs.)        (lbs.)         (lbs.)         (lbs.)

     1          I        0.52          0.68           0.48           0.58
     2                   0.49          0.62           0.55           0.58
     3                   0.59          0.70           0.46           0.61
     4                   0.36          0.70           0.58           0.63
     5                   0.28          0.62           0.51           0.65
     6                   0.49          0.66           0.38           0.71

     1          II       0.61          0.77           0.44           0.68
     2                   0.49          0.91           0.48           0.43
     3                   0.52          0.89           0.49           0.75
     4                   0.56          0.95           0.61           0.71
     5                   0.57          0.77           0.58           0.65
     6                   0.49          0.77           0.41           0.68

     1          III      0.52          0.Q;:          0.27           0.42
     2                   0.42          0.77           0.61           0.51
     3                   0.66          0.46           0.44           0.58
     4                   0.57          0.31           0.51           0.54
     5                   0.59          0.58           0.61           0.66
     6                   0.56          0.35           0.41           0.58

     1          IV       0.42          0.70           0.55           0.JO
     2                   0.51          0.37           0.72           0.30
     3                   0.47          0.53           0.63           0.44
     4                   0.50          0.60           0.65           0.66
     5                   0.55          0.64           0.48           0.63
     6                   0.26          0.33           O06            0.41

1. Calculate the analysis of variance for the crested wheat grass yields for a
subdivision of the total variation into blocks, varieties, error, and square yards
within plots.


2. Compare the variance due to samples and that due to replications. Make a statement
on the number of samples and number of plot replicates that you would recommend
in a subsequent experiment.

3. Determine the most economical design from the data at hand, i.e., the ratio of
replicate cost to sample cost.

CHAPTER XVII
COMPLEX EXPERIMENTS 1/


I. Use of Complex Experiments


The present trend in field experiments is toward somewhat complicated designs which
permit the study of several factors in one large-scale comprehensive experiment.
There are several advantages of the complex experiment. (1) One that includes several
treatments in all possible combinations permits a broad basis for generalization
due to the fact that the interactions, as well as the main effects, can be studied.
It is obvious that field experimental results may be influenced materially by
environmental conditions with the result that a combination of factors may provide a
more satisfactory answer to the problems under study. (2) The degrees of freedom
for error variance are higher than would be the case for single experiments designed
to study each factor separately. This leads to greater precision in the results.
(See Paterson, 1939).

The value of a complex experiment depends upon a careful analysis of the problem and
the various treatment combinations to be tested. The amount of complexity introduced
depends also upon the facilities and funds available. It is a safe precaution to
key-out the degrees of freedom for the various factors to be tested before field
work begins to be sure that the proposed plan is satisfactory. After the data are
collected the investigator should make certain that the data are sufficiently
homogeneous to combine in a single test. (See paragraph VII).

II. Application to a Barley Variety Trial

In agronomic tests of cereal crop varieties it is often desirable to conduct the
trials at various points in the area under consideration and to carry them on for a
period of years. Some data collected by Immer and others (1934) on the yield of
barley varieties tested in randomized blocks in 4 locations in Minnesota for a 2-year
period will be used to illustrate the method of computation. The data are based on
6 square-yard samples harvested from each plot of approximately 1/40-acre each. Each
test consisted of 3 randomized blocks. The same 5 varieties were tested at University
Farm, Waseca, Crookston, and Grand Rapids for the years 1932 and 1935. The
yields in bushels per acre for each plot of each variety are given below in Table 1.


1/ From Dr. F. R. Immer, University of Minnesota, with minor modifications.

Table 1. Yields of five varieties of barley, replicated 3 times in each of 4 locations
in 1932 and 1935.

                     Block Number                    Block Number           Tot. for
Variety         I      II     III    Tot.       I      II     III    Tot.  both years

                   Univ. Farm - 1932               Univ. Farm - 1935
Manchuria      19.7   31.4   29.6   80.7       45.5   50.3   60.0  155.8      236.5
Glabron        28.6   38.3   43.5  110.4       47.5   41.1   49.4  138.0      248.4
Velvet         20.3   27.5   32.6   80.4       54.2   52.3   64.5  171.0      251.4
Wis. #38       27.9   40.0   46.1  114.0       62.2   53.1   74.7  190.0      304.0
Peatland       22.3   30.8   31.1   84.2       47.4   57.8   50.5  155.7      239.9
Total         118.8  168.0  182.9  469.7      256.8  254.6  299.1  810.5     1280.2

                     Waseca - 1932                   Waseca - 1935
Manchuria      40.8   29.4   30.2  100.4       53.9   58.8   47.7  160.4      260.8
Glabron        44.4   34.9   33.9  113.2       63.7   61.1   52.2  177.0      290.2
Velvet         44.6   41.4   26.2  112.2       53.9   59.1   56.4  169.4      281.6
Wis. #38       39.8   39.2   29.1  108.1       74.2   75.6   67.0  216.8      324.9
Peatland       71.5   47.6   55.4  174.5       51.1   47.3   45.0  143.4      317.9
Total         241.1  192.5  174.8  608.4      296.8  301.9  268.3  867.0     1475.4

                    Crookston - 1932                Crookston - 1935
Manchuria      34.7   29.1   35.1   98.9       42.1   47.1   30.8  120.0      218.9
Glabron        28.8   28.7   21.0   78.5       38.8   29.4   30.5   98.7      177.2
Velvet         29.8   38.4   28.0   96.2       42.1   40.0   39.8  121.9      218.1
Wis. #38       27.7   27.6   20.4   75.7       44.3   43.5   47.7  135.5      211.2
Peatland       43.0   32.7   32.0  107.7       53.9   51.8   50.3  156.0      263.7
Total         164.0  156.5  136.5  457.0      221.2  211.8  199.1  632.1     1089.1

                   Grand Rapids - 1932             Grand Rapids - 1935
Manchuria      20.2   30.2   16.0   66.4       26.6   26.5   32.7   85.8      152.2
Glabron        13.2   20.5    9.6   43.3       21.4   18.7   24.1   64.2      107.5
Velvet         24.5   41.6   30.6   96.7       20.7   26.8   30.4   77.9      174.6
Wis. #38       19.0   18.4   24.6   62.0       20.7   23.6   30.9   75.2      137.2
Peatland       27.6   30.0   22.7   80.3       32.6   40.0   34.2  106.8      187.1
Total         104.5  140.7  103.5  348.7      122.0  135.6  152.3  409.9      758.6

Total 4
Stations      628.4  657.7  597.7 1883.8      896.8  903.9  918.8 2719.5     4603.3


III. Analysis of Test into Components

The analysis of a complex experiment of this type is merely an extension of the
analysis of variance as applied to the randomized block test. The various factors
together with their degrees of freedom may be represented as follows: It is noted
that all block x variety 1/ interactions are included in error.

Variation due to:                                   Degrees of Freedom

Blocks                                                      2
Stations                                                    3
Years                                                       1
Varieties                                                   4
Interactions of:
   Varieties x Stations                                    12
   Varieties x Years                                        4
   Varieties x Stations x Years                            12
   Stations x Years                                         3
   Blocks x Stations                                        6
   Blocks x Years                                           2
   Blocks x Stations x Years                                6
   Blocks x Varieties                          8 )
   Blocks x Varieties x Stations              24 )  Error  64
   Blocks x Varieties x Years                  8 )
   Blocks x Varieties x Stations x Years      24 )

Total                                                     119

1/ The symbol (x) in this connection denotes interaction.


There will be a total of 119 degrees of freedom for the combined test since there are
120 plots. The degrees of freedom for the main effects will be one less than the
number of blocks, stations, years, and varieties, respectively. The degrees of freedom
for interaction are obtained by the multiplication of the degrees of freedom for
the variables involved. For example, varieties x stations will be (4)(3) = 12. For
the second order interaction, varieties x stations x years, the degrees of freedom
will be (4)(3)(1) = 12. After the degrees of freedom are keyed-out, the remainder
of the computation must be made in accordance with this plan.
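
The keying-out of degrees of freedom described above can be checked mechanically. The following is a minimal Python sketch, not part of the original manual; the names `levels` and `key_out` are hypothetical. It multiplies the degrees of freedom of the factors in each term and pools every term containing both blocks and varieties into error.

```python
from itertools import combinations
from math import prod

# Number of levels of each factor in the combined barley test.
levels = {"blocks": 3, "stations": 4, "years": 2, "varieties": 5}

def key_out(levels):
    """Return {term: degrees of freedom} for every main effect and interaction."""
    table = {}
    names = list(levels)
    for order in range(1, len(names) + 1):
        for term in combinations(names, order):
            # D.F. of an interaction = product of the D.F. of its factors.
            table[" x ".join(term)] = prod(levels[f] - 1 for f in term)
    return table

df = key_out(levels)

# Every term that contains both blocks and varieties is pooled into error.
error_df = sum(v for k, v in df.items() if "blocks" in k and "varieties" in k)
total_df = sum(df.values())

for term, value in df.items():
    print(f"{term:45s}{value:4d}")
print("error D.F.:", error_df, " total D.F.:", total_df)
```

The total agrees with the number of plots less one, which is a useful check before any field work begins.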

IV. Computation of the Sums of Squares

The correction factor, $(Sx)^2/N$, is computed for the total yield of the 120 plots,
i.e., $(4603.3)^2/120$ = 176,586.4241. This factor will be used for the entire test.

For the total sum of squares, the 120 individual plot yields are squared and summed.
This value, $S(x^2)$, is equal to 200,879.35. The correction factor is then subtracted,
viz., 200,879.35 - 176,586.4241 = 24,292.9259 (XIII). 1/
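
As an illustration of this first step, the sketch below computes the correction factor and the total sum of squares from the 120 plot yields. It is a hedged example rather than the authors' procedure: the yields are entered as read from Table 1, and where a printed figure was unclear the value used is the one that agrees with the marginal totals, which is an assumption. The name `yields` is hypothetical.

```python
# Plot yields in bushels per acre, keyed by (station, year). Each test lists the
# varieties in the order Manchuria, Glabron, Velvet, Wis. #38, Peatland, and each
# tuple holds the yields in blocks I, II, III (values assumed from Table 1).
yields = {
    ("U. Farm", 1932): [(19.7, 31.4, 29.6), (28.6, 38.3, 43.5), (20.3, 27.5, 32.6),
                        (27.9, 40.0, 46.1), (22.3, 30.8, 31.1)],
    ("U. Farm", 1935): [(45.5, 50.3, 60.0), (47.5, 41.1, 49.4), (54.2, 52.3, 64.5),
                        (62.2, 53.1, 74.7), (47.4, 57.8, 50.5)],
    ("Waseca", 1932): [(40.8, 29.4, 30.2), (44.4, 34.9, 33.9), (44.6, 41.4, 26.2),
                       (39.8, 39.2, 29.1), (71.5, 47.6, 55.4)],
    ("Waseca", 1935): [(53.9, 58.8, 47.7), (63.7, 61.1, 52.2), (53.9, 59.1, 56.4),
                       (74.2, 75.6, 67.0), (51.1, 47.3, 45.0)],
    ("Crookston", 1932): [(34.7, 29.1, 35.1), (28.8, 28.7, 21.0), (29.8, 38.4, 28.0),
                          (27.7, 27.6, 20.4), (43.0, 32.7, 32.0)],
    ("Crookston", 1935): [(42.1, 47.1, 30.8), (38.8, 29.4, 30.5), (42.1, 40.0, 39.8),
                          (44.3, 43.5, 47.7), (53.9, 51.8, 50.3)],
    ("Grand Rapids", 1932): [(20.2, 30.2, 16.0), (13.2, 20.5, 9.6), (24.5, 41.6, 30.6),
                             (19.0, 18.4, 24.6), (27.6, 30.0, 22.7)],
    ("Grand Rapids", 1935): [(26.6, 26.5, 32.7), (21.4, 18.7, 24.1), (20.7, 26.8, 30.4),
                             (20.7, 23.6, 30.9), (32.6, 40.0, 34.2)],
}

# Flatten to the individual plot yields.
plots = [x for test in yields.values() for row in test for x in row]

n = len(plots)                                      # N, the number of plots
grand_total = sum(plots)                            # Sx
correction = grand_total ** 2 / n                   # (Sx)^2 / N
total_ss = sum(x * x for x in plots) - correction   # S(x^2) - (Sx)^2 / N

print(n, round(grand_total, 1), round(correction, 4), round(total_ss, 4))
```

The same correction factor is reused for every later sum of squares, so it is convenient to compute it once and keep it.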

The data for the remainder of the computations are grouped from table 1 into tables,
each with two variables. The sums of squares are computed the same as for a simple
randomized block test. It is to be noted that totals are used for the variable
concerned. For this reason, the sums of squares obtained must be divided by the number
of basic plots included in the respective totals (to reduce the variables to a single
plot basis) in order for the common correction factor to apply.

The combined data for a comparison of varieties and stations are given in table 2.

1/ The roman numeral refers to the line in Table 8.

Table 2. Total Yields grouped for Varieties and Stations

Barley                              Station
Variety        U. Farm    Waseca    Crookston   Grand Rapids     Total

Manchuria       236.5      260.8      218.9        152.2          868.4
Glabron         248.4      290.2      177.2        107.5          823.3
Velvet          251.4      281.6      218.1        174.6          925.7
Wis. #38        304.0      324.9      211.2        137.2          977.3
Peatland        239.9      317.9      263.7        187.1         1008.6

Total          1280.2     1475.4     1089.1        758.6         4603.3


These data are taken directly from the right-hand column of table 1. The computation
of the sums of squares is carried out as follows:

$$\text{Total} = \frac{S(x^2_{vs})}{6} - \frac{(Sx)^2}{N} = \frac{1{,}129{,}020.73}{6} - 176{,}586.4241 = 188{,}170.1216 - 176{,}586.4241 = 11{,}583.6975$$

$$\text{Varieties} = \frac{S(x^2_{v})}{24} - \frac{(Sx)^2}{N} = \frac{4{,}261{,}251.19}{24} - 176{,}586.4241 = 177{,}552.1329 - 176{,}586.4241 = 965.7088$$

$$\text{Stations} = \frac{S(x^2_{s})}{30} - \frac{(Sx)^2}{N} = \frac{5{,}577{,}329.97}{30} - 176{,}586.4241 = 185{,}910.9990 - 176{,}586.4241 = 9{,}324.5749$$

$$\text{Varieties} \times \text{Stations} = \text{Total} - (\text{Varieties} + \text{Stations}) = 11{,}583.6975 - 10{,}290.2837 = 1{,}293.4138$$

The values for varieties, stations, and varieties x station total are included in
table 7, where the steps for computation are indicated. The interaction values are
included in table 8.
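
The computation from table 2 may be sketched in Python as follows. This is an illustrative sketch only; the helper `ss` and the variable names are hypothetical, and the totals are those of table 2 (each the sum of 6 plots).

```python
# Variety x station totals from table 2; columns are U. Farm, Waseca,
# Crookston, Grand Rapids. Each figure is the total of 6 plots.
table2 = {
    "Manchuria": [236.5, 260.8, 218.9, 152.2],
    "Glabron":   [248.4, 290.2, 177.2, 107.5],
    "Velvet":    [251.4, 281.6, 218.1, 174.6],
    "Wis. #38":  [304.0, 324.9, 211.2, 137.2],
    "Peatland":  [239.9, 317.9, 263.7, 187.1],
}

N = 120                                    # plots in the whole experiment
cells = [x for row in table2.values() for x in row]
cf = sum(cells) ** 2 / N                   # correction factor (Sx)^2 / N

def ss(totals, plots_per_total):
    """Sum of squares of a set of totals, reduced to a single-plot basis."""
    return sum(t * t for t in totals) / plots_per_total - cf

variety_totals = [sum(row) for row in table2.values()]        # 24 plots each
station_totals = [sum(col) for col in zip(*table2.values())]  # 30 plots each

ss_cells = ss(cells, 6)             # "Total" for the 20 variety-station figures
ss_varieties = ss(variety_totals, 24)
ss_stations = ss(station_totals, 30)
ss_interaction = ss_cells - ss_varieties - ss_stations  # varieties x stations

print(round(ss_cells, 4), round(ss_varieties, 4),
      round(ss_stations, 4), round(ss_interaction, 4))
```

Dividing by the number of plots in each total before subtracting the correction factor is what allows one common correction factor to serve throughout.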

The sums of squares for the other factors are computed in a similar manner, the data
being given in Tables 3, 4, 5, and 6, with the results included in table 7.

In table 3 are given the data for comparisons of varieties and years, the yields at
the four stations being totaled.

Table 3. Total yields grouped for varieties and years.

                          Year
Variety            1932         1935         Total

Manchuria          346.4        522.0        868.4
Glabron            345.4        477.9        823.3
Velvet             385.5        540.2        925.7
Wis. #38           359.8        617.5        977.3
Peatland           446.7        561.9       1008.6

Total             1883.8       2719.5       4603.3

In table 4 are assembled the data for comparisons of blocks and stations, the block
totals for the two years of each station being added.

Table 4. Total yields of blocks and stations

                              Stations
Block          U. Farm    Waseca    Crookston   Grand Rapids     Total

I               375.6      537.9      385.2        226.5         1525.2
II              422.6      494.4      368.3        276.3         1561.6
III             482.0      443.1      335.6        255.8         1516.5

Total          1280.2     1475.4     1089.1        758.6         4603.3


In table 5 are the totals for comparison of blocks and years. This table is assembled
from the totals at the bottom of table 1.

Table 5. Total yields of blocks and years

                          Year
Block              1932         1935         Total

I                  628.4        896.8       1525.2
II                 657.7        903.9       1561.6
III                597.7        918.8       1516.5

Total             1883.8       2719.5       4603.3

One other table is necessary, that of stations and years.

Table 6. Total yields of stations and years

                              Station
Year           U. Farm    Waseca    Crookston   Grand Rapids     Total

1932            469.7      608.4      457.0        348.7         1883.8
1935            810.5      867.0      632.1        409.9         2719.5

Total          1280.2     1475.4     1089.1        758.6         4603.3


The calculation of the sums of squares for the complete analysis can be performed
with the least difficulty and confusion when the steps are carried through in a routine
manner. Many of the calculations are given in table 7. The remainder follow easily
and logically.

Table 7. Calculation of sums of squares.

                   (1)         (2)      (3)        (4)          (5)           (6)           (7)
                Total of      Calc.    No. of    Single      (1) / (4)     (Sx)^2/N       Sum of      Key to
Variate         squares       from    variables  plots in                                 squares     table 8
                              table    squared   each total                              (5) - (6)
                                                  squared

S(x^2)          200,879.35      1       120         1      200,879.3500  176,586.4241  24,292.9259    XIII
S(x^2_s)      5,577,329.97      2         4        30      185,910.9990       "         9,324.5749    II
S(x^2_y)     10,944,382.69      3         2        60      182,406.3782       "         5,819.9541    III
S(x^2_v)      4,261,251.19      2         5        24      177,552.1329       "           965.7088    IV
S(x^2_b)      7,064,601.85      4         3        40      176,615.0462       "            28.6221    I
S(x^2_sy)     2,897,377.01      6         8        15      193,158.4673       "        16,572.0432
S(x^2_vs)     1,129,020.73      2        20         6      188,170.1216       "        11,583.6975
S(x^2_vy)     2,206,627.61      3        10        12      183,885.6342       "         7,299.2101
S(x^2_bs)     1,871,824.37      4        12        10      187,182.4370       "        10,596.0129
S(x^2_by)     3,650,180.03      5         6        20      182,509.0015       "         5,922.5774
S(x^2_vsy)      593,855.03      1        40         3      197,951.6767       "        21,365.2526
S(x^2_bsy)      974,322.93      1        24         5      194,864.5860       "        18,278.1619

A notation found to be very convenient in practice is to let $S(x^2)$ be the sum of the
squares of the individual plots. The station totals are designated $x_s$, the variety
totals $x_v$, etc. The totals for varieties at the separate stations are designated $x_{vs}$,
varieties in different years by $x_{vy}$, etc. These are given in table 7. In column 1
of this table are the sums of the squares of the variates concerned and in column 2
is given the table from which they have been computed. Thus, $S(x^2_s)$ is calculated from
the 4 station totals of table 2. $S(x^2_v)$ is calculated from the variety totals of
table 2. The value of $S(x^2_{vs})$ is the sum of the squares of the 20 yields for each
variety at each station separately in table 2.

Column 3 of table 7 gives the number of figures squared under column 1. Column 4
gives the number of single plots contained in each figure squared. Column 5 is
simply column 1 divided by column 4. This is necessary to reduce the sums of squares to a
single plot basis throughout. The key numbers refer to that sum of squares in the
complete analysis of variance given as table 8.

The sums of squares for total, blocks, stations, years and varieties are transferred
directly from table 7 to table 8. The interaction sums of squares can be obtained
from table 7 by subtraction.
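
The routine of table 7 amounts to the same two operations for every row: divide the total of squares by the number of plots in each figure squared, then subtract the correction factor. A minimal sketch follows, with the sums of squared totals taken as given in table 7; the dictionary `table7` and the function `sum_of_squares` are hypothetical names used only for illustration.

```python
CF = 4603.3 ** 2 / 120      # column 6: correction factor (Sx)^2 / N

# variate: (column 1 = total of squares, column 4 = single plots in each total)
table7 = {
    "plots":            (200_879.35, 1),
    "stations":         (5_577_329.97, 30),
    "years":            (10_944_382.69, 60),
    "varieties":        (4_261_251.19, 24),
    "blocks":           (7_064_601.85, 40),
    "stations-years":   (2_897_377.01, 15),
    "var-stations-yrs": (593_855.03, 3),
}

def sum_of_squares(total_of_squares, plots_per_total, cf=CF):
    """Column 7 of table 7: column 1 / column 4 (= column 5) less column 6."""
    return total_of_squares / plots_per_total - cf

ss = {name: sum_of_squares(*row) for name, row in table7.items()}

# A first order interaction is obtained by subtraction of the two main effects
# from the sum of squares of the corresponding two-way totals.
stations_x_years = ss["stations-years"] - ss["stations"] - ss["years"]

for name, value in ss.items():
    print(f"{name:18s}{value:14.4f}")
print(f"{'stations x years':18s}{stations_x_years:14.4f}")
```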

The sum of squares for interaction of varieties x stations will be found by subtraction
of the sum of squares for varieties and stations separately from the sum of
squares opposite $S(x^2_{vs})$ in table 7, e.g.

     11,583.6975   (19 D.F.)
      - 965.7088   ( 4 D.F. for varieties)                IV
    - 9,324.5749   ( 3 D.F. for stations)                 II
    ------------
      1,293.4138   (12 D.F. for varieties x stations)     V

Since there were 20 figures used to obtain $S(x^2_{vs})$ there would be 19 degrees of freedom.
The interaction degrees of freedom are obtained by subtraction, e.g., 19 -
(4 + 3) = 12. All other first order interactions are obtained in the same manner.

The second order interaction of varieties x stations x years is obtained by subtraction
from the sum of squares opposite $S(x^2_{vsy})$ in table 7 of the sums of squares for
varieties, stations and years separately and their first order interactions in all
possible combinations. Thus:

     21,365.2526   (39 D.F.)
      - 965.7088   ( 4 D.F. for varieties)                      IV
    - 9,324.5749   ( 3 D.F. for stations)                       II
    - 5,819.9541   ( 1 D.F. for years)                          III
    - 1,293.4138   (12 D.F. for varieties x stations)           V
      - 513.5472   ( 4 D.F. for varieties x years)              VI
    - 1,427.5142   ( 3 D.F. for stations x years)               VIII
    ------------
      2,020.5396   (12 D.F. for varieties x stations x years)   VII
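
The subtraction for the second order interaction can be written out as a short Python sketch. The figures are the sums of squares and degrees of freedom quoted in the text; the variable names are hypothetical and the sketch is only an arithmetic check, not a general routine.

```python
# (sum of squares, degrees of freedom) as given in the text.
cells_vsy = (21365.2526, 39)     # 40 variety-station-year totals
main = {"varieties": (965.7088, 4), "stations": (9324.5749, 3), "years": (5819.9541, 1)}
cells_two_way = {
    ("varieties", "stations"): (11583.6975, 19),
    ("varieties", "years"):    (7299.2101, 9),
    ("stations", "years"):     (16572.0432, 7),
}

# First order interactions: two-way totals less the two main effects.
first_order = {}
for (a, b), (ss_ab, df_ab) in cells_two_way.items():
    first_order[(a, b)] = (ss_ab - main[a][0] - main[b][0],
                           df_ab - main[a][1] - main[b][1])

# Second order interaction: three-way totals less all main effects
# and all first order interactions.
ss_vsy = cells_vsy[0] - sum(s for s, _ in main.values()) \
                      - sum(s for s, _ in first_order.values())
df_vsy = cells_vsy[1] - sum(d for _, d in main.values()) \
                      - sum(d for _, d in first_order.values())

print(round(ss_vsy, 4), df_vsy)
```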

The sums of squares for the main effects and the first order interactions for the
computation of the second order interaction are taken from table 8 opposite the
appropriate key number. The interaction of blocks x stations x years is obtained in a
similar manner.

The complete analysis is now carried out in table 8, the error sum of squares being
obtained as a remainder.

Table 8. Complete analysis of variance

Key                                                       Sums of         Mean
No.    Variation due to:                        D.F.      Squares        Square      F-Value 1/

I      Blocks                                     2        28.6221       14.3110
II     Stations                                   3     9,324.5749    3,108.1916     162.85**
III    Years                                      1     5,819.9541    5,819.9541     304.92**
IV     Varieties                                  4       965.7088      241.4272      12.65**

       Interaction of:
V      Varieties x stations                      12     1,293.4138      107.7845       5.65**
VI     Varieties x years                          4       513.5472      128.3868       6.73**
VII    Varieties x stations x years              12     2,020.5396      168.3783       8.82**
VIII   Stations x years                           3     1,427.5142      475.8381      24.93**
IX     Blocks x stations                          6     1,242.8159      207.1360      10.85**
X      Blocks x years                             2        74.0012       37.0006       1.94
XI     Blocks x stations x years                  6       360.6795       60.1132       3.15**

XII    Error                                     64     1,221.5546       19.0868
         (Blocks x varieties                      8)
         (Blocks x varieties x stations          24)
         (Blocks x varieties x years              8)
         (Blocks x varieties x sta. x yrs.       24)

XIII   Total                                    119    24,292.9259

**Exceeds the 1 per cent point in Snedecor's table of "F".
1/ For comparison with error.

In practice, it is unnecessary to key-out the complete analysis as given in table 8.
The variation due to blocks, blocks x stations, blocks x years, and blocks x stations
x years should be grouped as one quantity, being designated as "Blocks within stations
and years" or "Blocks within tests" (for 16 D.F.). The reason for this is
readily apparent. The blocks are numbered I, II, and III arbitrarily. Block I at
University Farm has no relation to Block I at Waseca or any other station. Thus, it
is an error to regard blocks as a factor that occurs at several definite levels (3
in this case). The correct procedure is therefore to compute the block sums of
squares for each experiment and combine them to present in the final analysis. The
analysis of variance may then be presented as in table 9.

Table 9. Analysis of Variance (in summary form)

Variation                                       Sums of         Mean
due to:                               D.F.      Squares        Square       F-Value

Blocks within Tests                    16     1,706.1187      106.6324       5.59**
Stations                                3     9,324.5749    3,108.1916     162.85**
Years                                   1     5,819.9541    5,819.9541     304.92**
Varieties                               4       965.7088      241.4272      12.65**
Interactions:
   Varieties x stations                12     1,293.4138      107.7845       5.65**
   Varieties x years                    4       513.5472      128.3868       6.73**
   Varieties x stations x years        12     2,020.5396      168.3783       8.82**
   Stations x years                     3     1,427.5142      475.8381      24.93**
Error                                  64     1,221.5546       19.0868

Total                                 119    24,292.9259
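
The pooled "Blocks within tests" line can be computed directly from the block totals of the 8 separate tests, as the text recommends. The sketch below is illustrative only: the block totals are those at the foot of each test in Table 1, the error mean square is taken from the analysis, and the names `block_totals` and `block_ss` are hypothetical.

```python
# Totals of blocks I, II, III in each test; each total is the sum of 5 plots.
block_totals = {
    ("U. Farm", 1932): (118.8, 168.0, 182.9),
    ("U. Farm", 1935): (256.8, 254.6, 299.1),
    ("Waseca", 1932): (241.1, 192.5, 174.8),
    ("Waseca", 1935): (296.8, 301.9, 268.3),
    ("Crookston", 1932): (164.0, 156.5, 136.5),
    ("Crookston", 1935): (221.2, 211.8, 199.1),
    ("Grand Rapids", 1932): (104.5, 140.7, 103.5),
    ("Grand Rapids", 1935): (122.0, 135.6, 152.3),
}
PLOTS_PER_BLOCK = 5      # one plot of each variety in every block
ERROR_MS = 19.0868       # error mean square from the complete analysis

def block_ss(totals):
    """Block sum of squares within one test, using that test's own correction."""
    n = PLOTS_PER_BLOCK * len(totals)
    return sum(t * t for t in totals) / PLOTS_PER_BLOCK - sum(totals) ** 2 / n

ss_blocks_within = sum(block_ss(t) for t in block_totals.values())
df_blocks_within = sum(len(t) - 1 for t in block_totals.values())
ms_blocks_within = ss_blocks_within / df_blocks_within
f_value = ms_blocks_within / ERROR_MS     # compared with error

print(round(ss_blocks_within, 4), df_blocks_within,
      round(ms_blocks_within, 4), round(f_value, 2))
```

Because each test uses its own correction term, no assumption is made that Block I at one station corresponds to Block I at another.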

The analysis may be summarized still further in case the investigator is not interested
in the variation due to stations, years, and stations x years. He may group
these factors into variation due to "tests within stations and years" or simply
"tests" (for 7 D.F.). The variety factor and its interactions would be given in detail
because it has definite biological significance.


V. Sums of Squares in Simple vs. Complex Experiments

It will be useful at this stage to relate the complete analysis in Table 8 with the
simple randomized block tests computed for each of the 8 tests separately. The sums
of squares for total, blocks, varieties, and error are given in table 10 for the 8
tests.

Table 10. Sums of squares calculated from the tests separately

                        Sum of squares   Sum of squares   Sum of squares   Sum of squares
Test          Year        for total        for blocks      for varieties     for error

U. Farm     - 1932          867.2973         450.0973         375.6106          41.5894
U. Farm     - 1935        1,031.7133         251.6253         506.3600         273.7280
Waseca      - 1932        1,907.1360         471.3960       1,196.3293         239.4107
Waseca      - 1935        1,203.5600         131.1480         993.8400          78.5720
Crookston   - 1932          487.8733          80.8333         252.6266         154.4134
Crookston   - 1935          807.8360          49.2040         595.8227         162.8093
G. Rapids   - 1932          905.5573         179.6853         536.1640         189.7080
G. Rapids   - 1935          509.9093          92.1293         336.4560          81.3240

Total                     7,720.8825       1,706.1185       4,793.2092       1,221.5548

It is noted that the sums of squares for error for the 8 separate tests add to
1,221.5548. This agrees with the sum of squares for error of table 8 (1,221.5546),
the discrepancy being due to dropping of decimals. There were 8 D.F. for error in
each separate test or 8 by 8 = 64 in all tests. These same 64 D.F. were used for
error in the complete analysis in table 8. The error used in table 8 is, therefore,
simply the average error of the separate tests.

The sums of squares added for blocks, in the 8 tests, give 1,706.1185 (see table 10).
Addition of the sums of squares for blocks, blocks x stations, blocks x years, and
blocks x stations x years from table 8 gives a total of 1,706.1187, which agrees also.
Further comparisons are given in table 11.
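
Each row of table 10 is an ordinary randomized block analysis of one test. As an illustration, the sketch below analyzes the University Farm test of 1932, with the plot yields entered as read from Table 1 (an assumption where the print is unclear); the variable names are hypothetical.

```python
# University Farm, 1932: rows are varieties (Manchuria, Glabron, Velvet,
# Wis. #38, Peatland); columns are blocks I, II, III. Yields in bu. per acre.
test = [
    (19.7, 31.4, 29.6),
    (28.6, 38.3, 43.5),
    (20.3, 27.5, 32.6),
    (27.9, 40.0, 46.1),
    (22.3, 30.8, 31.1),
]

n_var, n_blk = len(test), len(test[0])
plots = [x for row in test for x in row]
cf = sum(plots) ** 2 / len(plots)          # correction factor for this test alone

ss_total = sum(x * x for x in plots) - cf
ss_blocks = sum(sum(col) ** 2 for col in zip(*test)) / n_var - cf
ss_varieties = sum(sum(row) ** 2 for row in test) / n_blk - cf
ss_error = ss_total - ss_blocks - ss_varieties   # obtained as a remainder

df_error = (n_var - 1) * (n_blk - 1)

print(round(ss_total, 4), round(ss_blocks, 4),
      round(ss_varieties, 4), round(ss_error, 4), df_error)
```

Repeating this for all 8 tests and adding the error sums of squares reproduces, apart from dropped decimals, the error line of the complete analysis.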

Table 11. Comparison of degrees of freedom and sums of squares of the 8 separate
tests with the complete analysis of table 8.

                Calculated from 8
Variation        separate tests              Calculated from the complete analysis
due to:        D.F.         Sum of Sq.      D.F.    Sum of Sq.     Key to table 8

Blocks       (8)(2)  =  16   1,706.1185      16     1,706.1187     I, IX, X, XI
Varieties    (8)(4)  =  32   4,793.2092      32     4,793.2094     IV, V, VI, VII
Error        (8)(8)  =  64   1,221.5548      64     1,221.5546     XII

Total        (8)(14) = 112   7,720.8825     112     7,720.8827     I, IV, V, VI, VII, IX,
                                                                   X, XI, XII

From the above table the analogy between the separate analyses of variance for each
test and the complete analysis is clear. The 112 D.F. for total in table 11 belong to
the total sum of squares within tests. When the 7 D.F. between tests (i.e., stations =
3, years = 1 and stations x years = 3) are added, the full 119 D.F. are obtained. The
same is true for the sums of squares.

VI. Interpretation of the Data

The manner in which the data can be interpreted will now be illustrated. From table
8 it is seen that the variance (mean square) due to varieties compared with variance
due to error exceeds the 1 per cent point. Therefore, some of the varietal differences
are significant. The mean yields for varieties, computed from Table 1, are
given in table 12.

Table 12. Mean yields of 3 plots of each variety, average yields for both years and
average yields of varieties for all tests.

                                        Variety
Year              Manchuria    Glabron    Velvet    Wis. #38    Peatland

                                    University Farm
1932                 26.9        36.8      26.8       38.0        28.1
1935                 51.9        46.0      57.0       63.3        51.9
Mean yield           39.4        41.4      41.9       50.7        40.0

                                        Waseca
1932                 33.5        37.7      37.4       36.0        58.2
1935                 53.5        59.0      56.5       72.3        47.8
Mean yield           43.5        48.4      46.9       54.2        53.0

                                       Crookston
1932                 33.0        26.2      32.1       25.2        35.9
1935                 40.0        32.9      40.6       45.2        52.0
Mean yield           36.5        29.5      36.4       35.2        44.0

                                      Grand Rapids
1932                 22.1        14.4      32.2       20.7        26.8
1935                 28.6        21.4      26.0       25.1        35.6
Mean yield           25.4        17.9      29.1       22.9        31.2

Mean for all
stations             36.2        34.3      38.6       40.7        42.0


The variance due to error was 19.0868 (table 8). The standard error of a single plot
would be $\sqrt{19.0868}$ = 4.369 bu. Since 24 plots are involved in the variety averages
for all stations, the standard error of the mean of 24 plots is $4.369/\sqrt{24}$ = 0.892 bu.
The standard error of the difference between two such means would be $0.892\sqrt{2}$ = 1.26 bu.
With 64 D.F. for error one may accept twice the standard error of the difference as a
level for odds of approximately 19:1 against the chance occurrence of a difference of
(2)(1.26) = 2.52 bu.

Since the mean yield of Peatland for all stations and years was 42.0 bu., any variety
that differs from it by more than 2.5 bu. may be judged as probably significantly
lower in yield, on the basis of these tests alone. On this basis Manchuria, Glabron
and Velvet are significantly lower in yield than Peatland.

The interaction of varieties x stations was also significant (table 8). A first order
interaction is essentially a difference between two differences. The mean yield of
Peatland at University Farm, for an average of both years, was 10.7 bushels less than
the yield of Wisconsin #38 (50.7 - 40.0 = 10.7). The mean yield of Peatland at Grand
Rapids exceeded the yield of Wisconsin #38 by 8.3 bu. (31.2 - 22.9 = 8.3). The question
then is whether these two differences are significantly different. This difference
between two differences will be given by Wisconsin #38 minus Peatland at University
Farm less Wisconsin #38 minus Peatland at Grand Rapids, or (50.7 - 40.0) -
(22.9 - 31.2) = 19.0 bu. The standard error of this "cross difference" will be

$$\sqrt{\frac{19.0868 \times 2 \times 2}{6}} = 3.567 \text{ bu.}$$

It may also be computed as follows: The standard error of the mean ($\sigma_{\bar{x}}$) is
equal to $\sqrt{19.0868/6} = 1.784$ bu., since 6 plots are contained in each mean. The
standard error of the difference between two differences then is
$1.784\sqrt{2}\sqrt{2} = 3.567$ bu. Twice this is 7.13 bu. and any "cross difference" that
exceeds this value is expected to occur less than once in 20 trials by random sampling
alone. The cross difference for Peatland and Wisconsin #38 at University Farm and Grand
Rapids greatly exceeds 7.13 bu., being 19.0 bu. It is clear, therefore, that these two
varieties responded in a differential manner at University Farm and Grand Rapids as an
average of 1932 and 1935. Other significant cross differences could be found in the
same way.

Significant  interactions  of  varieties  x  years  could  be  determined  by  application  of 
the  general  procedure  outlined  above.   Since  only  two  years  are  involved  these  inter- 
actions of  varieties  x  years  can  have  very  little  practical  significance. 

While the second order interaction of varieties x stations x years was also significant,
it is of secondary interest. This significant second order interaction merely
means that certain differential responses of varieties x stations were not constant
in different years. To illustrate the types of comparisons which must be made to
show this, take the means of Glabron and Velvet at University Farm in 1932 and 1935
separately and the same yields at Grand Rapids. Then:

[(56.8 - 26.8) - (46.0 - 37.0)] - [(14.4 - 32.2) - (21.4 - 26.0)] = 34.2

with an error of $\sqrt{19.0868 \times 2 \times 2 \times 2 / 3}$ or 7.13 bu.

Since the difference of 34.2 exceeds (2)(7.13) = 14.26 bu. it is obviously significant.
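
Both interaction errors follow one pattern: the error variance is multiplied by the number of means entering the comparison and divided by the number of plots in each mean. The sketch below assumes that reading; `contrast_se` is a hypothetical helper name, not something from the original text, and the yields are the University Farm and Grand Rapids means quoted above.

```python
import math

ERROR_VARIANCE = 19.0868  # error mean square quoted from table 8

def contrast_se(error_variance, n_means, plots_per_mean):
    """Standard error of a +/- comparison among n_means means of equal size."""
    return math.sqrt(error_variance * n_means / plots_per_mean)

# First order: Wisconsin #38 minus Peatland at University Farm,
# less the same difference at Grand Rapids (means of 6 plots each).
cross_difference = (50.7 - 40.0) - (22.9 - 31.2)
first_order_se = contrast_se(ERROR_VARIANCE, n_means=4, plots_per_mean=6)

# Second order: eight means of 3 plots each enter the comparison.
second_order_se = contrast_se(ERROR_VARIANCE, n_means=8, plots_per_mean=3)

print(round(cross_difference, 1), round(first_order_se, 3), round(2 * first_order_se, 2))
print(round(second_order_se, 2), round(2 * second_order_se, 2))
```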

For a complete understanding of a complex analysis, of which that given in table 8 is
an example, one further comparison can be made. Suppose that V, S and Y are designated
to represent variance due to varieties, stations and years and V x S, V x Y and
V x S x Y the interaction variances. Then one may determine whether the variance due
to:

V  >  V x S  >  V x S x Y
                            >  Error
V  >  V x Y  >  V x S x Y

by means of the "F" test. When the variance due to varieties significantly exceeds
the interaction, varieties x stations, there is evidence that varietal performance
generally was consistent enough to demonstrate that some varieties were the best in
all stations, as an average of the years in which tests were made. When the variety
variance significantly exceeds that of varieties x years, one may conclude that, as
an average of all stations, some varieties were consistently better in yield in the
different years.

Further, when the interaction of varieties x stations significantly exceeds varieties
x stations x years, it is plain that the differential responses of the varieties at
the separate stations were sufficiently similar in the different years to warrant the
conclusion that these differential responses may be permanent features of these
localities.

Unless the variance for varieties significantly exceeds that of varieties x stations
or varieties x years, no general recommendations can be made for the entire state or
for future years. To make such recommendations the stations (at which tests were
made) must be considered as random samples of all places in the state and the years in
which tests were conducted must be considered as a random sample of all future years.
It is only when the number of stations and years can be considered an adequate sample
of all possible places and years that worthwhile predictions can be made for all
places in the state and for future years. (See Summerby, 1937).

VII. The Homogeneity Test

The question may be raised as to whether the data afforded by the several experiments
are sufficiently alike to assume that they may have resulted from a single population.
In case this is true, the data from the experiments may be consolidated and analyzed
as one complex experiment. Homogeneity tests have been suggested by Snedecor (1937)
and by Stevens (1936).

The  formula  may  be  explained  as  follows: 

Let n = the number of experiments.

n - 1 = the number of degrees of freedom between experiments.

M = the number of degrees of freedom for error within an experiment.

e = the sum of squares for error in a single experiment.

v = the observed variance

V = the theoretical variance

L = the Lexis ratio

The observed variance of the sums of squares due to error for all experiments is:

$$v = \frac{S(e - \bar{e})^2}{n - 1} = \frac{S(e^2) - (Se)^2/n}{n - 1}$$

The theoretical variance, where the total, S(e), is assumed to be the population of
error sums of squares, is as follows:

$$V = 2M\left[\frac{S(e)}{nM}\right]^2 \quad\text{or}\quad \frac{2}{M}\left[\frac{S(e)}{n}\right]^2$$

The Lexis ratio (L) is the ratio of the observed to the theoretical standard error,
so that its square is $L^2 = v/V$. When the ratio is greater than one, the series of
sums of squares due to error is called supernormal. When "L" is less than one, it is
called subnormal.

A certain degree of supernormality or subnormality can be attributed to chance. The
limits for significance can be determined by the $\chi^2$ test, viz.,

$$\chi^2 = (n-1)L^2 \quad\text{or}\quad \frac{(n-1)(v)}{(V)}$$

When $\chi^2$ corresponds to a probability of less than 0.05, the series is too supernormal
to admit that it resulted from a single population. A $\chi^2$ that corresponds to a
probability greater than 0.95 indicates that the series is too subnormal for
consolidation of the data.

The homogeneity test can be applied to the data on the barley yield trials as compiled
in the separate tests in table 10:

Test            Year    Sums of squares due to error

U. Farm         1932             41.5894
U. Farm         1935            273.7280
Waseca          1932            239.4107
Waseca          1935             78.5720
Crookston       1932            154.4134
Crookston       1935            162.8093
Grand Rapids    1932            189.7080
Grand Rapids    1935             81.3240

n = 8, (n-1) = 7, M = 8, S(e) = 1221.5548


The sum of squares for the eight "error sums of squares" is as follows:

$$S(e^2) = (41.5894)^2 + (273.7280)^2 + \cdots + (81.3240)^2 = 233{,}100.8233$$

$$(Se)^2/n = 186{,}524.5773$$

$$S(e - \bar{e})^2 = S(e^2) - (Se)^2/n = 46{,}576.2460$$

$$v = \frac{S(e - \bar{e})^2}{n-1} = \frac{46{,}576.2460}{7} = 6{,}653.7494$$

$$V = 2M\left[\frac{S(e)}{nM}\right]^2 = 16\left[\frac{1221.5548}{(8)(8)}\right]^2 = 16\,(19.0868)^2 = (16)(364.3059) = 5828.8944$$

$$\chi^2 = \frac{(n-1)(v)}{V} = \frac{(7)(6653.7494)}{5828.8944} = 7.9906$$

When the $\chi^2$ table is entered for 7 degrees of freedom, P = 0.3335.

Therefore, the data are sufficiently homogeneous to permit the calculation of one
generalized standard error for all tests.
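
A minimal Python sketch of the homogeneity test follows, using the eight error sums of squares from the barley trials with M = 8. The function `chi2_upper_tail_df7` is an added convenience, not part of the original method: it evaluates the standard closed form for the upper tail of the chi-square distribution with 7 degrees of freedom in place of a table lookup. Hand-rounded figures in the text differ slightly from the machine values in the later decimals.

```python
import math

# Error sums of squares for the eight separate tests (table 10), M = 8 D.F. each.
error_ss = [41.5894, 273.7280, 239.4107, 78.5720,
            154.4134, 162.8093, 189.7080, 81.3240]
M = 8
n = len(error_ss)

total = sum(error_ss)
mean = total / n
v = sum((e - mean) ** 2 for e in error_ss) / (n - 1)   # observed variance
V = 2 * M * (total / (n * M)) ** 2                     # theoretical variance
chi2 = (n - 1) * v / V                                 # (n - 1) * L**2

def chi2_upper_tail_df7(x):
    """P(chi-square with 7 D.F. > x), closed form valid for 7 D.F. only."""
    z = math.sqrt(x)
    series = z + z ** 3 / 3 + z ** 5 / 15
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(z / math.sqrt(2)) + 2 * density * series

p_value = chi2_upper_tail_df7(chi2)
print(round(total, 4), round(v, 2), round(V, 2), round(chi2, 4), round(p_value, 3))
```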

VIII.  Transformation  of  Percentage  Data 

Some types of discrete data cannot be combined to provide a valid estimate of a
generalized standard error. This applies particularly to some forms of percentage
data wherein each variate represents a certain number of observations of a given type
or condition out of a total number of trials or cases (N). The variance of a single
variate of this type is pqN, where q = 1 - p. It is clearly dependent upon p, the
estimated ratio of existence of the type or condition in question, as well as upon N.
Bliss (1937), Salmon (1938), Cochran (1938), Clark and Leonard (1939), and others have
recognized that each variate in discrete data of this kind does not have the same
opportunity to contribute equally to a general experimental error.

(a)  The  Angular  Transformation 

R. A. Fisher has supplied a mathematical transformation for such data which will
equalize the estimated variance of each variate so that it is functionally dependent
only on N, the total number of trials. In this transformation, each estimate of p is
replaced by $\sin^2 \theta$, whence

$$\theta = \sin^{-1}\sqrt{p} \quad\text{or}\quad \tfrac{1}{2}\cos^{-1}(1 - 2p)$$

This transformation must be applied to discrete data of this type so that the analysis
of variance may be valid. However, it is of little practical importance when the
percentage values are between 30 and 70. Bliss (1937) has compiled a convenient
table for the transformation of percentage values to angles, the latter being measured
in degrees (See Table 5, appendix).
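
Where a table of angles is not at hand, the transformation can be computed directly. The sketch below is illustrative; the function names are not from the original text, and both forms of the formula are included to show that they agree.

```python
import math

def angular_transform(percent):
    """Convert a percentage (0 to 100) to the angle, in degrees, whose squared sine is p."""
    p = percent / 100.0
    return math.degrees(math.asin(math.sqrt(p)))

def angular_transform_cosine_form(percent):
    """Equivalent form: one half of the angle whose cosine is 1 - 2p."""
    p = percent / 100.0
    return 0.5 * math.degrees(math.acos(1 - 2 * p))

for pct in (0.9, 6.3, 25.0, 50.0, 92.4):
    print(pct, round(angular_transform(pct), 1), round(angular_transform_cosine_form(pct), 1))
```

Percentages near 0 or 100 are stretched the most, while values between 30 and 70 change almost linearly, which is why the transformation matters little in that range.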

(b)  Classification  of  Percentage  Data 

The type of discrete data, rather than its expression in percentages, determines
whether or not the transformation should be employed. The types of percentage data
are classified by Clark and Leonard (1939) as follows:

(1) Continuous data from an experimental study may be expressed as percentages when
each variate is divided by an arbitrary constant value, whereby each variate becomes a
percentage of some standard or average. Clearly such a procedure merely transforms
the unit of measurement. Percentages of this type should be treated statistically
exactly as though the data were in their raw form. For example, yield data might be
expressed in percentage of the check instead of actual yield in pounds.

(2) Continuous data are often expressed in percentages to show concentrations. This
type of percentage is very common. Some examples are: seed purity given by weight of
pure seed/total weight of seed, leafiness given by leaf weight/total plant weight,
protein content given by weight of protein/total weight, sugar content given by weight
of sugar/weight of root, etc. Such concentrations should not, as a rule, be subjected
to any transformation to equalize the variance.

(3) The third type of percentage is where the original data are discrete, being based
upon a determinate number of trials or cases (N). The transformation,
$p = \sin^2 \theta$, should be applied to this type where it is desired to construct a
generalized standard error. Illustrations of this type are as follows: Germination
percentages given by number of seeds germinated/total seeds, disease percentages given
by number of plants diseased/total plants, etc.

Questions for Discussion

1. What are the advantages of a complex experiment over separate single tests?
2. As a matter of design, would it be necessary to have all varieties in all locations
   in each year? Why?
3. Why does the total sum of squares for the simple tests fall short of the total sum
   of squares for the complex experiment? What would make them check?
4. How would you interpret a significant interaction such as, for example, varieties
   x stations?
5. Explain why a first order interaction is essentially a difference between two
   differences.
6. What is a homogeneity test? Why should it be made?
7. Under what conditions should percentage data be transformed to degrees of an angle
   to admit valid use of a pooled estimate of error?

Problems

1. The yields in bushels per acre for five spring wheat varieties tested in 3
randomized blocks for 3 years are given below. (Data from F. R. Immer).


Block  Number 
I           II         III 

Tot. 

Block  Number 

Block  Number 
I         II       III 

Grand 

Variety 

I         II         III       Tot. 

Tot.      Tot. 

Thatcher 

Ceres 

Reward 

Marquis 

Hcpe 

U.   Farm  -  1951 

17.0  20.0     19.7     56.7 

16.1  18.9     20.5     55.5 
21.1     25.1     21.8     66.0 
15.4     20.9     18.4     54.7 
20.3     21.0     14.2     55.5 

U.  Farm  -  1932 

33.6  37.7     31.2  102.5 

29.7  30.0     55.9     95.6 
24.1     26.9     29.8     80.8 
26.3     31.3     29.8     87.9 
28.1     25.4     51.5     85.O 

U.   Farm 
32.4     3^.3  37-3 
20.2     27.5  25.9 
29.2     27.8  30.2 
12.8    12.3  14.8 
21.7     24.5  23.4 

-  1955 

104.0  263-2 
73-6  224.5 
87.2  234.0 
39-9  182.5 
69.6  210.1 

Total 

89.9  103-9 

Waseca  - 
26.8     35-6 
29-2     32.4 
23.3     22.8 
26.2     28.8 
22.6     21.0 

94.4  288.2 

1931 

26.4  86.8 
26.0     87.6 

18.5  64.6 
25.5    78.3 
24.2    67.8 

141.8  151.8  153.2  451.8  116.5  126.41516  574.51114.3 
Waseca  -  1932                   Waseca  -  1935' 

Thatcher 

Ceres 

Reward 

Marquis 

Hope 

22.3     20.7     25.5     67.0 
24.3     26.2     26.7     77-2 
27.2     24.9     24.6     76.7 
27.8    26.5     24.0     73.1 
24.0     25.7     23.3     71.0 

28.5  30.1  30.1 

15.6  14.5  14.5 
13.0     22.4  25.2 
15.4      6.4     4.9 
23.0    29.0  25.5 

88.7  242.5 
44.4  209.2 
•65.6  206. 9 
24.7  181.1 
82.3  221.1 

Total        128.1  138.6  113.4  585.I  126.1  121.8  122.1  570.0  105.5  102.4  99-8  505-7  1060.8 


Thatcher 

Ceres 

Reward 

Marquis 

Hope 


59-0 
54.6 

52.5 
51.4 


Crookston  -  1951 

30.4  lOf.O 


37-6 
37- 4^ 

31.3 

26.4 


33-7  105.7 
29.3  93-6 
30.5     88.3 


Crookston  -  1932 


Crookston 


25.1  15.2 

51.1  19.5 

23.1  22.8 

20.1  19.2 


20.8 
20.9 
19.8 
15.5 


59-1 
71.5 
65-7 
54.8 


27.0 
15.4 

16.8 

5-* 


24.2  17.5 
11.0  11".  5 
16.4  14.6 
5.9    8.4 


1935 

~£o\7  234.8 
57-9  215.1 
47.8  207.1 
17.7  160.8 


27.8    50.6    29.4    87.3    19.5    25.2    20.8    65.5    18.0    18.5  15-0    51.5  204.6 


Total        165- 5  165-3  153-8  482.4  116. 9  101. 9    97-3  316.6     32.6     75.8  67.0  225-410224 


Total        383.5  405.8  566.6  1155.7  584.3  575-5  573.11153.4  502.4  502.6  29814  905. 4  5197-5 

(a) Calculate the analysis of variance for the complete study.

(b) Test the significance of the different mean squares compared with the error
variance, using the F test.

(c) Compare Thatcher with Ceres, as an average of all tests, using the standard error
of the difference.

(d) What would be the standard error for testing the significance of the interaction
between Thatcher and Ceres in 1932 and 1935, as an average of all stations? Make
the proper test of significance.

(e) Do the same, as under (d), for comparing Thatcher and Ceres at University Farm
and Waseca, as an average of all years. Is this interaction significant?

(f) Calculate the sums of squares for blocks, varieties, error and total for each of
the 9 separate tests and add the different components for all 9 tests. Compare
these sums of squares with appropriate combinations in the complete analysis of
variance table.

2. Test the data in problem 1 for homogeneity.

3. The relative infection in different varieties for 5 bunt collections was as
follows (Data from Salmon, 1938):


Variotv 


Bunt  Oro  Rid it  Alb it  Turkey 

Collection (l)^      (g) [l] _[2j_ £U (2) (l)     (?) 

(Pet.)  (Pet.)   (Pet.)  (Pet.)    (Pet.) (Pet.)    (Pet.) (Pet.) 


1  0.0  0.9  6.3  3.9  0.0  0.0  8.3  6.5 

2.  2.5  3.6  8.7  2.2  92.  U  90.5  89.O  8'4.3 

3  1.5  0.0  6.0  0.7  93.7  90.1  3-3  6.0 

k  1.5  6.3  k.l  3.1  1U.0  lj.5  81.7  87.2 

5  0.6  1.7  3.9  3.6  U,2  3-2  7,5  2.U 


(a) Transform these percentage data to degrees of an angle and compute the analysis
of variance for varieties, replicates, and collections.

(b) Compute the data without the transformation and compare the results with (a).


1/ These numbers refer to replicates.


CHAPTER XVIII
THE SPLIT-PLOT EXPERIMENT 1/

I.  Use  of  Split  Plot  Experiments 

Sometimes it is an advantage to use relatively large plots for one series of treatments
and sub-divide these whole plots into a number of sub-plots to superimpose a
second series of treatments. This type of design, called the split-plot experiment,
was first proposed by Yates (1933, 1935). It is particularly useful in spacing
tests with crop plants, some fertilizer trials, and in cultural studies. Le Clerg
(1937) used this type of experimental design to ascertain the effect of 5 fertilizer
mixtures (main treatments) on the seedling stand in plots sown with treated and
untreated seed (sub-treatments). Goulden (1939) gives a more complicated split-plot
design in which he studied the incidence of root-rot on wheat varieties, kinds of
dust for seed treatment, method of dust application, and efficacy of soil-inoculation
with the root rot organism.

The split-plot design provides a more critical comparison of the sub-plot treatments
than it does for the whole-plot units. This is due to the larger number of replications
of the small units which, in turn, provide a larger number of degrees of freedom
for error. Paterson (1939) advises that the less important treatment effects
be allocated to the whole plots and the more important treatment effects to the
sub-plots in order to obtain the maximum precision where it is most desired.

The split-plot design leads to two or more errors. To simplify the computation, all
treatment values should be expressed in sub-plot units.

II. Data Used for Computation

Two designs are outlined below together with the method of computation. These data
are from a corn uniformity trial conducted at Waseca (Minnesota) in 1933 by C. W.
Doxtator. They are for yield in pounds for the central two rows of four-row plots
12 hills long. For purposes of calculation, it was supposed that these data were
obtained from 10 hybrids which are designated 1, 2, 3, ..., 10. It was supposed further
that these varietal plots had been split into three parts to test the yield of those
crosses obtained from F1, F2, and F3 generation seed. These are designated a, b, and
c, respectively. The yields in the tables that follow are in the same order as in
the field. The hypothetical hybrids and generations were superimposed on the data
by random arrangement.

III. Sub-treatments Randomized within Main Plot (Plan A)

The field arrangement of the plots is given below. The 10 hybrids are assumed to
have been planted in rows of 36 hills, using F1 seed for 12 hills, F2 seed for 12
hills and F3 seed for the remaining 12 hills. The order of the hybrids in the field
is random and the three generations of seed for each hybrid are planted in a random
order within each long row.


                    Hybrid Number
  3    8    2    1    6    7   10    9    4    5

  a    c    a    a    c    b    c    b    c    b
  c    a    b    c    b    a    b    c    a    a
  b    b    c    b    a    c    a    a    b    c

1/ This chapter is a modification of one prepared by Dr. F. R. Immer, University of
Minnesota, for his Applied Statistics course.

The yields of each plot are given below in table 1. Data from two blocks are used
to illustrate the calculations.

Table 1. Yield of corn per 12 hill plot and sums of yield of 36 hill plots.

Block I
                         Hybrid Number
         3    8    2    1    6    7   10    9    4    5    Total

         a    c    a    a    c    b    c    b    c    b
        48   46   46   42   43   47   48   46   46   49

         c    a    b    c    b    a    b    c    a    a
        46   45   44   46   45   49   45   48   48   49

         b    b    c    b    a    c    a    a    b    c
        43   42   42   44   44   47   45   47   47   48

T.     137  133  132  132  132  143  138  141  141  146     1375

Block II
                         Hybrid Number
         4    3    9    5    1    7    2    8    6   10    Total

         c    a    a    b    b    c    c    a    b    c
        46   45   46   45   43   48   44   44   47   43

         a    b    c    a    c    b    a    c    a    b
        48   44   46   45   50   51   48   46   48   43

         b    c    b    c    a    a    b    b    c    a
        42   42   44   43   44   48   47   46   44   42

T.     136  131  136  133  137  147  139  136  139  128     1362


(a) Calculation of Sums of Squares

The analysis of variance is given in table 2.

Table 2. Analysis of Variance

Variation due to:           D.F.    Sums of Squares    Mean Square

Blocks                        1           2.8166          2.8166
Hybrids                       9          77.6833          8.6315
Error (a)                     9          81.0167          9.0019

Plots of hybrids            (19)        161.5166

Generations                   2           7.2333          3.6167
Hybrids x generations        18          88.7667          4.9315
Error (b)                    20          40.6667          2.0333

Total                        59         298.1833

The total sum of squares is calculated from the squares of the 60 individual plot
yields as S(x²) - (Sx)²/N which, numerically, is 125,151.0000 - 124,852.8167 =
298.1833 (59 D.F.).

The sum of squares for blocks is

$$\frac{1375^2 + 1362^2}{30} - 124{,}852.8167 = 2.8166 \quad (1 \text{ D.F.})$$

Sum of squares for total plots of hybrids is calculated from the marginal totals for
hybrids in the above table. Thus,

$$\frac{137^2 + 133^2 + \cdots + 128^2}{3} - 124{,}852.8167 = 161.5166 \quad (19 \text{ D.F.})$$

To obtain the sums of squares for hybrids, generations and the interaction between
them it is necessary to set up another table with the yields of the two replicates
of each treatment combined.

                                 Hybrid Number
Generation     1    2    3    4    5    6    7    8    9   10     Sum

    a         86   94   93   96   94   92   97   89   93   87     921
    b         87   91   87   89   94   92   98   88   90   88     904
    c         96   86   88   92   91   87   95   92   94   91     912

   Sum       269  271  268  277  279  271  290  269  277  266    2737

Sum of squares for hybrids will be:

$$\frac{269^2 + 271^2 + \cdots + 266^2}{6} - 124{,}852.8167 = 77.6833 \quad (9 \text{ D.F.})$$

Sum of squares for generations is obtained from

$$\frac{921^2 + 904^2 + 912^2}{20} - 124{,}852.8167 = 7.2333 \quad (2 \text{ D.F.})$$

The sum of squares of the 30 yields in the above table will be equal to

$$\frac{86^2 + 94^2 + \cdots + 91^2}{2} - 124{,}852.8167 = 173.6833 \quad (29 \text{ D.F.})$$

Sum of squares for interaction of hybrids x generations will be:

     173.6833  (29 D.F.)
      -7.2333  ( 2 D.F.) for generations
     -77.6833  ( 9 D.F.) for hybrids

      88.7667  (18 D.F.) = sum of squares for interaction

(b)  Errors  to  Test  Significance 

These sums of squares are brought together in table 2. The sum of squares
for error (a) is obtained by subtracting the sums of squares for blocks (1 D.F.) and
hybrids (9 D.F.) from "plots of hybrids" (19 D.F.). The sum of squares for error
(b) is obtained by subtracting "plots of hybrids", generations and hybrids x
generations from the total.

Error (a) is an ordinary randomized block error and may be used to test the
significance of differences between hybrids.

Error  (b)  is  obtained  from  the  sum  of  the  interactions  between  generations  and 
blocks  within  hybrids.  Thus,  a  table  could  be  arranged  for  the  data  from  hybrid  No. 
3  (see  table  1)  as  follows: 


                 Generation
               a      b      c

Block I       48     43     46
Block II      45     44     42

The interaction of blocks x generations, for 2 D.F., could be used as error for this
simple comparison. However, a table similar to the above could be set up for each
hybrid. There would be, then, 10 x 2 = 20 degrees of freedom for error. This is
what is used as error (b). The mean square for error (b) is, then, the average
error of blocks x generations within hybrids. It will be the legitimate error to
use for comparing differences between generations and testing the interaction of
hybrids x generations. In practice this sum of squares is obtained by subtraction.
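
The whole Plan A computation can be retraced from the plot yields of table 1. The following Python sketch is one way to organize it; the data layout and the helper `ss` are illustrative choices, not part of the original text. Each sum of squares is the sum of squared totals divided by the number of plots per total, less the correction factor.

```python
from collections import defaultdict

# Table 1: hybrid order, generation letters and yields, row by row within each block.
blocks = {
    "I": {
        "hybrids": [3, 8, 2, 1, 6, 7, 10, 9, 4, 5],
        "gens": ["acaacbcbcb", "cabcbabcaa", "bbcbacaabc"],
        "yields": [[48, 46, 46, 42, 43, 47, 48, 46, 46, 49],
                   [46, 45, 44, 46, 45, 49, 45, 48, 48, 49],
                   [43, 42, 42, 44, 44, 47, 45, 47, 47, 48]],
    },
    "II": {
        "hybrids": [4, 3, 9, 5, 1, 7, 2, 8, 6, 10],
        "gens": ["caabbccabc", "abcacbacab", "bcbcaabbca"],
        "yields": [[46, 45, 46, 45, 43, 48, 44, 44, 47, 43],
                   [48, 44, 46, 45, 50, 51, 48, 46, 48, 43],
                   [42, 42, 44, 43, 44, 48, 47, 46, 44, 42]],
    },
}

# One record per 12-hill plot: (block, hybrid, generation, yield).
records = [(name, h, g, y)
           for name, blk in blocks.items()
           for gens, ys in zip(blk["gens"], blk["yields"])
           for h, g, y in zip(blk["hybrids"], gens, ys)]

correction = sum(r[3] for r in records) ** 2 / len(records)

def ss(key, plots_per_total):
    """Sum of squares of the totals formed by `key`, less the correction factor."""
    totals = defaultdict(int)
    for rec in records:
        totals[key(rec)] += rec[3]
    return sum(t * t for t in totals.values()) / plots_per_total - correction

total_ss = sum(r[3] ** 2 for r in records) - correction
block_ss = ss(lambda r: r[0], 30)
main_plot_ss = ss(lambda r: (r[0], r[1]), 3)      # "plots of hybrids"
hybrid_ss = ss(lambda r: r[1], 6)
generation_ss = ss(lambda r: r[2], 20)
interaction_ss = ss(lambda r: (r[1], r[2]), 2) - hybrid_ss - generation_ss
error_a = main_plot_ss - block_ss - hybrid_ss
error_b = total_ss - main_plot_ss - generation_ss - interaction_ss

for name, value in [("blocks", block_ss), ("hybrids", hybrid_ss), ("error (a)", error_a),
                    ("generations", generation_ss), ("hybrids x generations", interaction_ss),
                    ("error (b)", error_b), ("total", total_ss)]:
    print(f"{name:24s}{value:10.4f}")
```

The printed values agree with table 2 to within a unit in the last decimal place, the small differences arising from rounding in the hand computation.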

IV. Sub-treatments in Strips Across Blocks (Plan B)

The same yield figures are used in this plan as in Plan A. The location of the
hybrids is also the same. Instead of randomizing the generations within the plots
for each hybrid as in Plan A, the generations are now considered planted in long
strips crosswise of the entire block. However, randomization of the generations in
the different blocks is used. The field plan is given below.


Table 3. Yields of corn plots and the field arrangement of these plots.

Block I
                               Hybrid Number
Generation     3    8    2    1    6    7   10    9    4    5    Total

    a         48   46   46   42   43   47   48   46   46   49      461
    c         46   45   44   46   45   49   45   48   48   49      465
    b         43   42   42   44   44   47   45   47   47   48      449

   Tot.      137  133  132  132  132  143  138  141  141  146     1375

Block II
                               Hybrid Number
Generation     4    3    9    5    1    7    2    8    6   10    Total

    b         46   45   46   45   43   48   44   44   47   43      451
    c         48   44   46   45   50   51   48   46   48   43      469
    a         42   42   44   43   44   48   47   46   44   42      442

   Tot.      136  131  136  133  137  147  139  136  139  128     1362

The same plots are used here as in table 1. The hybrids occur in the same order as
in the previous table, the only difference being the arrangement of the generations.
In table 3 the generations occur in strips crosswise of the blocks.

Table 4. Analysis of variance from the data of table 3

Key number for
D.F. and sums of    Variation due to             D.F.    Sum of      Mean Square
squares                                                  Squares

 1                  Blocks                         1       2.8166       2.8166
 2                  Hybrids                        9      77.6833       8.6315
 3 = 4-1-2          Error (a)                      9      81.0167       9.0019

 4                  Plots of hybrids              19     161.5166

 1                  Blocks                         1       2.8166       2.8166
 5                  Generations                    2      35.4333      17.7167
 6 = 7-1-5          Error (b)                      2      16.2334       8.1167

 7                  Plots of "generations"         5      54.4833

 4                  Plots of hybrids              19     161.5166
 8 = 5+6            Deviation of generation
                    plots from blocks              4      51.6667
 9                  Hybrids x generations         18      61.5667       3.4204
10 = 11-9-8-4       Error (c)                     18      23.4333       1.3019

11                  Total                         59     298.1833

The total sum of squares (298.1833), sum of squares for blocks, hybrids, error (a)
and total plots of crosses will be the same as under Plan A. The position of the
"generation" plots has been changed, however, and the other sums of squares must be
recalculated.

As far as the test of the three generations a, b and c is concerned there are but
six plots as given in the marginal totals of table 3. The sum of squares for these
six "plots of generations" will be

$$\frac{461^2 + 465^2 + 449^2 + 451^2 + 469^2 + 442^2}{10} - 124{,}852.8167 = 54.4833 \quad (5 \text{ D.F.})$$

To obtain the sum of squares for generations and for interaction, a table is made up
by combining the two yields of each treatment.

                                 Hybrid Number
Generation     1    2    3    4    5    6    7    8    9   10    Tot.

    a         86   93   90   88   92   87   95   92   90   90     903
    b         87   86   88   93   93   91   95   86   93   88     900
    c         96   92   90   96   94   93  100   91   94   88     934

   Tot.      269  271  268  277  279  271  290  269  277  266    2737

The  sum  of  squares  for  generations  will  be 
35-4333  (2  D.F.) 


9002 


9542   .  12^,852.8167  - 


20 


The  yield  figures  in  the  above  table  are  squared,  i.e.,  So2  +  93^  +  88  .  The 

sum  is  divided  by  2,  the  correction  factor  then  being  subtracted.  This  gives 
174.6833  as  the  sum  of  squares  for  bhese  29  degrees  of  freedom.  The  sum  of  squares 
for  the  interaction,  hybrids  x  generations  will  he:  I7I+.6833  -  77.6833  -  35.4333  = 
61.5667  (18  D.F.). 

In table 4 it is noted that the comparison of hybrids is the same as under Plan A.
Error (b) will be obtained by the subtraction of blocks and generations from the
"plots of generations." It is seen from table 3 that the analysis of variance to
test the significance of generations involves only 6 large plots. The total yields
of these could be set down from the marginal totals of table 3 as:


                  Generations
               a       b       c      Total
Block I       461     449     465     1375
Block II      442     451     469     1362
Total         903     900     934     2737

An analysis of variance of this 2-by-3 table would give the second section in the
complete analysis of table 4. Error (b) is appropriate for the comparison of
differences between generations.
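
As a hedged illustration (not from the original text), the 2-by-3 analysis just described can be sketched in Python. The six totals are taken from the table above, each total being the sum of 10 sub-plots; the variable names are assumptions made for this example.

```python
# Totals of the six "generation" plots; each is the sum of 10 hybrid sub-plots.
plot_totals = {
    "I":  {"a": 461, "b": 449, "c": 465},
    "II": {"a": 442, "b": 451, "c": 469},
}
subplots_per_total = 10
n_plots = subplots_per_total * 6                     # 60 sub-plots in all

grand_total = sum(sum(row.values()) for row in plot_totals.values())
correction = grand_total ** 2 / n_plots

# "Plots of generations": 6 totals, 5 D.F.
ss_plots = sum(v ** 2 for row in plot_totals.values()
               for v in row.values()) / subplots_per_total - correction

# Blocks: 2 totals of 30 sub-plots each, 1 D.F.
ss_blocks = sum(sum(row.values()) ** 2 for row in plot_totals.values()) / 30 - correction

# Generations: 3 totals of 20 sub-plots each, 2 D.F.
gen_totals = {g: sum(row[g] for row in plot_totals.values()) for g in "abc"}
ss_generations = sum(t ** 2 for t in gen_totals.values()) / 20 - correction

# Error (b): remainder, (2 - 1) x (3 - 1) = 2 D.F.
ss_error_b = ss_plots - ss_blocks - ss_generations

print(round(ss_plots, 4), round(ss_blocks, 4),
      round(ss_generations, 4), round(ss_error_b, 4))
```

With only 2 degrees of freedom for error (b), the test of generations under this plan is necessarily a weak one, which is the point made in the comparison of designs that follows.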

Error (c) is obtained by subtracting from the total the items listed in table 4
opposite error (c). Error (c) is the second order interaction of blocks x hybrids
x generations, the degrees of freedom being 1 by 9 by 2 = 18. It was obtained in
table 4 by subtraction but has the above meaning. Error (c) is appropriate for
testing the significance of the interaction of hybrids x generations.

Since these were uniformity trial data no attempt will be made to determine the
significance of the different mean squares. In a practical experiment these tests
are carried out in the ordinary way, the appropriate errors given in the tables
being used.

Yates  (1933)  has  discussed  the  above  two  designs  rather  fully. 

V. Comparison of Two Designs

Suppose the 10 hybrids and 3 generations of the seed of each (F1, F2 and F3) had
been considered as simply 30 treatments and completely randomized within the blocks
without reference to split-plot arrangements. The analysis of the data would have
taken the form:

Variation due to:             D.F.

Blocks                          1
Hybrids                         9
Generations                     2
Hybrids x generations          18
Error                          29

Total                          59

The  degrees  of  freedom  for  error  given  above  (29)  is  equal  to  the  sum  of  the  degrees 
of  freedom  for  errors  (a)  and  (b)  under  Plan  A  and  the  sum  of  degrees  of  freedom  for 
errors  (a),  (b)  and  (c)  under  Plan  B. 
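
The bookkeeping of degrees of freedom can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and the partition of the error into (a), (b) and (c) follows the usual split-plot rules, which is an assumption here because the key-outs for Plans A and B appear in tables outside this passage.

```python
def key_out(blocks, hybrids, generations):
    """Return D.F. key-outs for the randomized-block, Plan A and Plan B layouts."""
    b, h, g = blocks, hybrids, generations
    common = {
        "Blocks": b - 1,
        "Hybrids": h - 1,
        "Generations": g - 1,
        "Hybrids x generations": (h - 1) * (g - 1),
    }
    total = b * h * g - 1
    # Simple randomized blocks: a single error term.
    randomized = {**common, "Error": total - sum(common.values())}
    # Plan A (assumed): hybrids on main plots, generations on sub-plots.
    plan_a = {**common,
              "Error (a)": (b - 1) * (h - 1),
              "Error (b)": h * (b - 1) * (g - 1)}
    # Plan B (assumed): hybrids and generations both laid out in strips.
    plan_b = {**common,
              "Error (a)": (b - 1) * (h - 1),
              "Error (b)": (b - 1) * (g - 1),
              "Error (c)": (b - 1) * (h - 1) * (g - 1)}
    return randomized, plan_a, plan_b


randomized, plan_a, plan_b = key_out(blocks=2, hybrids=10, generations=3)
for name, plan in (("Randomized", randomized), ("Plan A", plan_a), ("Plan B", plan_b)):
    error_df = sum(v for k, v in plan.items() if k.startswith("Error"))
    print(name, "error D.F. =", error_df, "total D.F. =", sum(plan.values()))
```

In each of the three layouts the error degrees of freedom add to 29 and the total to 59; the designs differ only in how those 29 are divided among the comparisons.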

Plan B is the same as Plan A insofar as precision of tests of the hybrids is
concerned. It differs from Plan A in that precision for the comparison of generations
is sacrificed in order to obtain greater precision for the interaction.

The design of an experiment will depend entirely on the element of the treatments
for which the highest degree of precision is desired. When the primary emphasis is
to be placed on the interactions, at the expense of higher errors for the main
effects, Plan B is to be preferred. When the main effects are of major interest,
either the complete randomized block or Plan A is to be preferred.

In practice the relative differences in magnitude of the different errors under
Plans A and B will depend on the dimensions of the blocks. In this case the blocks
were 40 rows wide, or 140 feet, and the 36-hill rows of hybrids were 126 feet long.
Consequently the "generation" plots tended to be closer together than the most
distant hybrids in the same block.

Plan A is particularly applicable to studies of space relationship between plants in
relation to yield. In a recent study of the effect of spacing on yield of soybeans,
conducted by the Division of Agronomy and Plant Genetics, University of Minnesota,
Plan A was found to be admirably suited to the test. The soybeans were planted in
4-row plots, the rows being 16, 20, 24, 28, 32, and 40 inches apart. Then, the
soybeans were planted at 4 different rates within each spacing, being 1/2, 1, 2, and
3 inches apart within rows. The only easy way to lay out such a test was to plant
the plots of different width rows crosswise of the regular 132-foot series. The 4
different rates of seeding were then randomized within these long rows, the ultimate
plots being 33 feet long.

Plan A could also be laid out as follows, using the same notation as employed in
table 1.

Hybrid Number  :       3       :       8       :       2       :  etc.
Generation     :  a :  c :  b  :  c :  a :  b  :  b :  a :  c  :  etc.

Here the hybrids are planted in groups of 3 plots with the three generations in a
random arrangement within each hybrid plot, but they occur side by side instead of
end to end. By this plan it is obvious that the comparisons between generations
(a, b, and c) will have a lower error than comparisons between hybrids (1, 2, 3,
etc.).

The data from this arrangement would be analyzed in exactly the same manner as given
under Plan A.

VI.  Randomized-Block  vs.  Split-Plot  Experiments 

The relative efficiency of randomized-block and split-plot experiments was studied
on uniformity trial data with sugar beets by Le Clerg (1937), both in the field and
in the greenhouse. He compared the magnitude of the variance of the sub-plots within
main plots in the split-plot design with the variance of sub-plots within blocks
in the randomized-block arrangement. The variance for sub-plots within main plots
in the split-plot design was markedly less than that for sub-plots within blocks in
the randomized arrangement. The split-plot design was 71 per cent more efficient
in one set of uniformity data and 53 per cent more efficient in another. For
comparisons of main plots within blocks there was a decrease in efficiency by the use
of the split-plot arrangement. Similar results were obtained for greenhouse trials,
although less marked.

Questions for Discussion

1. What is a split-plot design? Where is it used to advantage? List at least 3
situations.

2. Explain the differences in field lay-outs that lead to two and three errors.

3. Under what conditions would you use Plan "A"? Plan "B"?

4. Compare the relative efficiency of split-plot and randomized-block designs
superimposed on uniformity trial data.


Problems

The following data are from a randomized block experiment with "split plots" designed
to test the differences in yield of soybeans planted at different spacings between
and within rows. Four-row plots were used, one row being harvested for hay and one
for seed.


(A) Yield of soybeans in bushels per acre

Block    Width                                                  Block
No.      of rows     1/2"      1"       2"       3"     Total   Total

I         16"       25.1     21.3     22.3     22.1     90.8
          20"       21.8     22.7     22.2     22.8     89.3
          24"       21.9     21.8     21.2     20.6     85.5
          28"       21.2     20.4     20.4     17.9     79.9
          32"       20.7     20.0     16.3     20.0     79.0
          40"       19.3     18.3   [illeg.]   16.3   [illeg.]  [illeg.]

II        16"       25.2     19.9     22.1     22.7     89.9
          20"       21.9     21.3     22.1     22.9     88.2
          24"       19.7     19.8     20.1     19.8     79.4
          28"       20.8     21.2     18.8     20.6     81.4
          32"       18.3     20.7     17.5     16.4     73.1
          40"       18.3     18.2     19.8   [illeg.]   72.4    484.4

III       16"       15.7     21.6     22.9     20.3     80.5
          20"       22.0     20.4     22.4     20.7     85.5
          24"       23.5     20.7     20.7     20.5     87.4
          28"       21.5     19.9     20.3     20.9     82.8
          32"       22.0     19.3     13.1     17.8     77.2
          40"       20.3     16.4     17.3     13.3     72.9    486.3

IV        16"       23.8     29.0     12.3     23.5     88.6
          20"       27.0     21.2     20.5     20.7     89.4
          24"       23.3     20.0     22.5     19.8     85.6
          28"       22.3     21.3     22.7     13.9     83.6
          32"       23.9     18.4     20.7     18.7     81.7
          40"       19.9     17.8     16.9     18.3     73.1    504.0

Total              522.6    491.8    479.8    476.8   1971.0   1971.0

(B) Yield of dry hay in tons per acre

Block    Width                                                  Block
No.      of rows     1/2"      1"       2"       3"     Total   Total

I         16"       2.91     2.59     2.41     2.74    10.65
          20"       2.96     2.35     2.31     2.10     9.72
          24"       2.34     2.30     2.21     2.23     9.08
          28"       2.59     2.47     2.16     2.10     9.32
          32"       2.21     2.12     2.05     1.90     8.28
          40"       2.24     1.90     1.82     1.79     7.75    54.80

II        16"       2.85     2.42     2.45     2.31    10.03
          20"       2.42     2.48     2.31     2.27     9.48
          24"       2.40     2.19     2.29     2.08     8.96
          28"       2.48     2.22     2.30     2.08     9.08
          32"       2.32     2.10     2.23     2.06     8.71
          40"       2.34     2.07     1.76     1.78     7.95    54.21

III       16"       2.81     2.61     2.65     2.25    10.32
          20"       2.66     2.52     2.78     2.52    10.48
          24"       2.57     2.41     2.28     2.13     9.39
          28"       2.03     2.22     2.39     2.01     8.65
          32"       2.68     2.21     1.97     1.96     8.82
          40"       2.13     2.09     1.84     1.96     8.02    55.68

IV        16"       2.83     3.10     2.12     2.38    10.43
          20"       3.27     2.71     2.33     2.42    10.73
          24"       2.71     2.31     2.22     1.97     9.21
          28"       2.52     2.53     2.24     2.09     9.38
          32"       2.37     2.29     2.20     1.85     8.71
          40"       2.10     2.13     1.92     2.06     8.21    56.67

Total              60.74    56.34    53.24    51.04   221.36   221.36

The actual field arrangement of plots in block number III of this experiment was as
follows. The plot arrangement in the other blocks was randomized in a similar
manner.

Width of rows          Spacing within rows
    16"            1/2"      1"        3"        2"
    32"            1"        3"        2"        1/2"
    28"            [illeg.]  1/2"      [illeg.]  2"
    40"            2"        1"        1/2"      3"
    24"            1"        3"        2"        1/2"
    20"            3"        1"        1/2"      2"

1. Analyze the data on yields of soybeans in bu. per acre.

(a) Calculate the complete analysis of variance. Test the significance of the
different mean squares, compared with the appropriate error variances, by
means of the F test.

(b) Determine the significance of the difference between 20" and 32" rows by means
of the standard error.

(c) Determine the significance of the difference between 1/2" and 2" spacings by
means of the standard error.

2. Analyze the data on yields of soybeans for tons of dry hay per acre in a similar
manner.

3. Key out the degrees of freedom for a split-plot experiment (two errors) for 3
spacings, 4 blocks, and 3 widths of rows.

4. A split-plot experiment was designed to determine the effect of seed treatment on
the stand and ultimate yield of dryland corn planted at 3 different dates, a, b,
c. Each plot consisted of 2 sub-plots, the seed being treated (T) with an organic
mercury compound in one-half, and untreated (U) in the other half. There were 3
main plots in each block. All treatments were randomized. The field design of
Block I was as follows:


! 

T    '  U 
t 

T 

T    '   U 
t 

U    '    T 
i 

b 

c 

a 
i 

The yield data for the 6 blocks of the experiment were as follows in bushels per acre:

Date       Seed                         Block
Planted    Treatment     1      2      3      4      5      6     Total

a             U         2.3    4.6    3.4    2.3    5.8    3.3    24.2
              T         2.3    4.7    4.2    3.6    5.0    4.6    24.9
b             U         4.3  [illeg.] 3.3    6.1    4.5    4.0    26.5
              T         5.1    6.1    3.1    4.3    5.3    5.9    29.8
c             U         2.7    1.4    2.3    3.8    2.9    5.9    17.0
              T         2.0    1.3    1.3  [illeg.] 3.4    1.5    15.2

(a)  Calculate  the  complete  analysis  of  variance. 

(b)  Determine  the  significance  between  treated  and  untreated  seed,  and  also  between 
planting  dates. 


CHAPTER  Xn 
CONFOUNDING   IN  FACTORIAL  EXPERIMENTS 

I.  Factorial  Experiments 

The randomized-block and Latin-square designs are widely used in field experiments,
both being very efficient for simple studies. However, there are situations in
experimentation where a large number of varieties or treatments are to be compared at
two or more levels. The factorial experiment is useful in such situations. Suppose
that three fertilizers, Nitrogen (N), Phosphorus (P), and Potash (K), are to be
tested at two or more levels. The classical method of approach would be to vary the
levels of only one element at a time, i.e., the investigator would set up
separate experiments to test each element alone at its respective levels. The single
factor could then be studied under controlled conditions at each of the two levels.
To test these factors simultaneously in the same experiment would permit one to
study the effects of different amounts of one fertilizer on the others in all
combinations. Thus, a wider base of inductive reasoning is provided. The experimental
argument is also strengthened by the larger total number of plots in the test. (See
Fisher, 1935).

Goulden  (1937)  describes  a  factorial  experiment  as  one  made  to  study  simultaneously 
various  treatment  factors.  Thus,  an  experiment  designed  to  study  at  the  same  time 
rate  and  depth  of  seeding  of  a  cereal  crop  would  be  a  factorial  experiment  in  which 
two  factors,  rate  and  depth,  are  represented  at  two  or  more  levels.  The  study  of 
interactions  is  an  important  consideration  in  such  an  experiment.  The  introduction 
of  factors  is  limited  by  space  and  cost  of  experimentation. 

Suppose a fertilizer test is to be conducted with N, P, and K at two different rates
each. The rates can be designated by subscripts so as to give the eight possible
treatment variants as follows:¹

N0P0K0, N1P0K0, N0P1K0, N0P0K1, N1P1K0, N1P0K1, N0P1K1, and N1P1K1

The degrees of freedom, i.e., the number of comparisons free to vary, may be keyed
out as follows:

Variation due to        Degrees freedom     Remarks

Nitrogen (N)                  1   )
Phosphorus (P)                1   )         Main effects
Potassium (K)                 1   )

N x P ²                       1   )
N x K                         1   )         First order interactions
P x K                         1   )

N x P x K                     1             Second order interaction

Total                         7

¹ Note: The subscripts 0 and 1 represent the two fertilizer levels.

² The symbol (x) denotes interaction and not a variable as heretofore.
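
The key-out above can be generated mechanically. The short Python sketch below is an illustration, not part of the original text: it lists the eight treatment variants and the seven single-degree-of-freedom comparisons for three factors at two levels each.

```python
from itertools import combinations, product

factors = ["N", "P", "K"]
levels = (0, 1)                      # subscripts for the two fertilizer rates

# All treatment variants, written with subscripts as in the text (e.g. N1P0K1).
variants = ["".join(f"{f}{lv}" for f, lv in zip(factors, combo))
            for combo in product(levels, repeat=len(factors))]

# Every main effect and interaction; with two levels per factor each has
# (2 - 1) x (2 - 1) x ... = 1 degree of freedom.
effects = [" x ".join(group)
           for order in range(1, len(factors) + 1)
           for group in combinations(factors, order)]
degrees_of_freedom = {effect: 1 for effect in effects}

print(variants)
for effect, df in degrees_of_freedom.items():
    print(f"{effect:<10} {df}")
print("Total     ", sum(degrees_of_freedom.values()))
```

The total of 7 is one less than the number of treatment variants, as it must be.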

II. Data for Computation of Factorial Experiment

The computations for the analysis of variance for such a factorial experiment will
be illustrated with uniformity trial data. Four complete replications will be used.
The uniformity data on crested wheatgrass were furnished by Dr. B. M. Weihing. The
plots are combined as 8-row plots, 16 feet long, with rows 6 inches apart. Thus,
each plot is 4 by 16 feet in size. The yields are given in grams of air-dry
field-cured hay. The uniformity trial data follow.

Table 1. Uniformity Data for Crested Wheatgrass

                           Blocks
Plot No.       I         II        III        IV
             (gm.)     (gm.)     (gm.)     (gm.)

   1         5135      3175      4405      3750
   2         4725      3980      4575      3920
   3         4600      4420      3910      4175
   4         4955      4580      4065      3280
   5         3210      3970      3510      3190
   6         3670      4255      4305      3575
   7         3785      3665      3995      3530
   8         3965      4315      4030      2900


III. Computation as Simple Randomized Block Experiment

The eight treatments will first be superimposed on the crested wheatgrass yield data
for a randomized block test.¹

Table 2. Yields of Crested Wheatgrass in Randomized Blocks

                                        Treatment
Replication     0       N       P       K       NP      NK      PK      NPK     Totals

I             3210    4955    5135    4600    3965    3785    3670    4725     34045
II            3970    3175    4420    3980    4255    3665    4315    4580     32360
III           4305    4405    4575    3910    4030    3510    3995    4065     32795
IV            3530    4175    3920    3280    2900    3575    3190    3750     28320

Totals       15015   16710   18050   15770   15150   14535   15170   17120    127520

The sums of squares for blocks, treatments, total, and error are computed in the
ordinary manner. The analysis of variance can be summarized as follows:

Variation                    Sum           Mean              "F" Value
due to          D.F.       Squares        Square      Observed    5 Pct. Point

Blocks            3       2,303,556      767,852        3.91          3.07
Treatments        7       2,628,237      375,462        1.91          2.57
Error            21       4,125,107      196,434

Total            31       9,056,900

The block effect removed is just enough to be statistically significant. Treatment
effects are within the limits of error since the data are from a uniformity trial.
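
The "ordinary manner" of computation can be sketched as follows. This Python fragment is an illustration only, with variable names chosen for the example; it recomputes the randomized-block analysis from the yields of Table 2.

```python
# Yields in grams by replication; columns are 0, N, P, K, NP, NK, PK, NPK.
table2 = {
    "I":   [3210, 4955, 5135, 4600, 3965, 3785, 3670, 4725],
    "II":  [3970, 3175, 4420, 3980, 4255, 3665, 4315, 4580],
    "III": [4305, 4405, 4575, 3910, 4030, 3510, 3995, 4065],
    "IV":  [3530, 4175, 3920, 3280, 2900, 3575, 3190, 3750],
}
rows = list(table2.values())
n_blocks, n_treatments = len(rows), len(rows[0])
n_plots = n_blocks * n_treatments

grand_total = sum(map(sum, rows))
correction = grand_total ** 2 / n_plots

ss_total = sum(y * y for row in rows for y in row) - correction
ss_blocks = sum(sum(row) ** 2 for row in rows) / n_treatments - correction
ss_treatments = sum(sum(col) ** 2 for col in zip(*rows)) / n_blocks - correction
ss_error = ss_total - ss_blocks - ss_treatments

df_blocks, df_treatments = n_blocks - 1, n_treatments - 1
df_error = df_blocks * df_treatments

ms_error = ss_error / df_error
f_blocks = (ss_blocks / df_blocks) / ms_error
f_treatments = (ss_treatments / df_treatments) / ms_error

print(f"Blocks      {df_blocks:>2} {ss_blocks:>12,.0f} {f_blocks:6.2f}")
print(f"Treatments  {df_treatments:>2} {ss_treatments:>12,.0f} {f_treatments:6.2f}")
print(f"Error       {df_error:>2} {ss_error:>12,.0f}   MS = {ms_error:,.0f}")
print(f"Total       {n_plots - 1:>2} {ss_total:>12,.0f}")
```

The 5 per cent points are not computed here; they are read from a table of F as in the text.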


¹ Note: The same randomization for treatments is used here as in the confounded
experiment to be mentioned later.

The crested wheatgrass yield data will now be considered from the standpoint of
confounding. This process is expected to accomplish several things: (1) A greater
amount of the variability due to soil heterogeneity should be removed because more
and smaller blocks will be used; (2) A chance to examine the second order
interaction, N x P x K, will be forfeited; and (3) The reduction of experimental
error in this manner should sharpen all treatment and interaction comparisons.

IV. Confounding in a 2 by 2 by 2 Experiment¹

A  few  terms  must  first  be  made  clear  before  the  analyses  are  made. 

(a)  Explanation  of  Terms 

Every effort is made to maintain orthogonality in an experiment. Yates
(1933) defines orthogonality as follows: "Orthogonality is that property of the
design which ensures that the different classes of effects shall be capable of direct
and separate estimation without any entanglements." Thus, orthogonality is ensured
in a randomized block experiment by the very nature of the design, i.e., each block
contains the same kind and number of treatments. Non-orthogonality is introduced
when some of the plots in one or more of the blocks are lost. Special methods may
be required to separate treatment and block effects.

Non-orthogonality is sometimes deliberately introduced in factorial experiments that
involve a fairly large number of combinations. This process is called confounding.
The purpose is to increase the accuracy of the more important comparisons at the
expense of the comparisons of lesser importance.

(b) Confounding the Second-order Interaction

The second order interaction (N x P x K) in this experiment may be considered
the least important. Certainly, it would be difficult to interpret in terms of
fertilizer practice, even though it were significant. Suppose it is desired to
confound the one degree of freedom for this interaction with blocks. To accomplish
this, it is necessary to determine the distribution of treatments in the blocks in a
manner so as to confound this one interaction and, at the same time, leave the others
intact.

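Algebraically, the treatment effects can be represented as follows:

$$
\begin{aligned}
\text{Nitrogen (N)} &= (N_1 - N_0)(P_1 + P_0)(K_1 + K_0) \\
\text{Phosphorus (P)} &= (P_1 - P_0)(N_1 + N_0)(K_1 + K_0) \\
\text{Potassium (K)} &= (K_1 - K_0)(N_1 + N_0)(P_1 + P_0) \\
N \times P &= (N_1 - N_0)(P_1 - P_0)(K_1 + K_0) \\
N \times K &= (N_1 - N_0)(K_1 - K_0)(P_1 + P_0) \\
P \times K &= (P_1 - P_0)(K_1 - K_0)(N_1 + N_0) \\
N \times P \times K &= (N_1 - N_0)(P_1 - P_0)(K_1 - K_0)
\end{aligned}
$$

The last expression can be expanded as follows:

$$
\begin{aligned}
N \times P \times K &= (N_1 - N_0)(P_1 - P_0)(K_1 - K_0) \\
&= (N_1P_0K_0 + N_0P_1K_0 + N_0P_0K_1 + N_1P_1K_1) - (N_0P_0K_0 + N_1P_1K_0 + N_0P_1K_1 + N_1P_0K_1) \\
&= (N + P + K + NPK) - (0 + NP + PK + NK)
\end{aligned}
$$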

Then, the blocks could be divided as follows so as to confound the second order
interaction with block effect:

     N    P    K    NPK            0    NP    PK    NK

        Sub-block A                   Sub-block B

¹ For more complicated designs see Yates (1933), Fisher (1935) and Goulden (1937).

The contrasts between the two sub-blocks of each replicate will be contrasts of the
second order interaction (N1P1K1 and N0P0K0). This interaction will have been
confounded with blocks.
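
The division into sub-blocks follows directly from the signs in the expanded product. The Python sketch below is an illustration with hypothetical function names: it assigns each treatment to sub-block A or B according to its sign in the N x P x K contrast, and then checks that every other comparison remains balanced within each sub-block, which is what leaves those comparisons free of block effect.

```python
from itertools import product

FACTORS = "NPK"
COMBOS = list(product((0, 1), repeat=3))      # levels of N, P, K for each treatment


def name(levels):
    """Label a treatment by the fertilizers applied; "0" is the check plot."""
    return "".join(f for f, lv in zip(FACTORS, levels) if lv) or "0"


def sign(effect, levels):
    """Sign of a treatment in the contrast for an effect such as "NP"."""
    s = 1
    for f, lv in zip(FACTORS, levels):
        if f in effect:
            s *= 1 if lv else -1
    return s


def sub_blocks(confounded):
    """Split the treatments by their sign in the confounded contrast."""
    plus = [c for c in COMBOS if sign(confounded, c) > 0]
    minus = [c for c in COMBOS if sign(confounded, c) < 0]
    return plus, minus


block_a, block_b = sub_blocks("NPK")
print("Sub-block A:", [name(c) for c in block_a])
print("Sub-block B:", [name(c) for c in block_b])

# Every unconfounded effect has two plus and two minus signs in each sub-block.
balance = {effect: (sum(sign(effect, c) for c in block_a),
                    sum(sign(effect, c) for c in block_b))
           for effect in ("N", "P", "K", "NP", "NK", "PK")}
print(balance)
```

The same function, called with "NP", "NK" or "PK", gives the division needed to confound a first order interaction instead.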

The sum of squares for the second order interaction will be given by (see Goulden,
1937):

$$\frac{1}{2k}\left[(N + P + K + NPK) - (0 + NP + PK + NK)\right]^2$$

where k = number of plots represented in each total. The above sum of squares will
contain not only the second order interaction effect but also the block effect.

In this case, blocks of four plots each have been used for error control instead of
blocks of eight (as would be the case in a simple randomized block experiment); and
only the second order interaction has been lost. The key-out for four complete
replications will be as follows:


Variation due to        Degrees of freedom

Blocks                          7
N                               1
P                               1
K                               1
N x P                           1
N x K                           1
P x K                           1
Error                          18

Total                          31


The treatments will be randomized in each sub-block. The field arrangement and plot
yields follow:

Table 3. Field Plan with Plot Locations and Yields

                      Sub-block A               Sub-block B
Replication       Treatment    Yield        Treatment    Yield

I                  N0P1K0      5135          N0P0K0      3210
                   N1P1K1      4725          N0P1K1      3670
                   N0P0K1      4600          N1P0K1      3785
                   N1P0K0      4955          N1P1K0      3965
                   Total      19415          Total      14630

II                 N1P0K0      3175          N0P0K0      3970
                   N0P0K1      3980          N1P1K0      4255
                   N0P1K0      4420          N1P0K1      3665
                   N1P1K1      4580          N0P1K1      4315
                   Total      16155          Total      16205

III                N1P0K0      4405          N1P0K1      3510
                   N0P1K0      4575          N0P0K0      4305
                   N0P0K1      3910          N0P1K1      3995
                   N1P1K1      4065          N1P1K0      4030
                   Total      16955          Total      15840

IV                 N1P1K1      3750          N0P1K1      3190
                   N0P1K0      3920          N1P0K1      3575
                   N1P0K0      4175          N0P0K0      3530
                   N0P0K1      3280          N1P1K0      2900
                   Total      15125          Total      13195

The yield data are summarized for main effects in Table 4 as follows:

Table 4. Total Yields for Four Replications per Treatment

                 K0         K1        Sum                       K0         K1        Sum

N0   P0       15,015     15,770     30,785       N1   P0     16,710     14,535     31,245
     P1       18,050     15,170     33,220            P1     15,150     17,120     32,270

     Total    33,065     30,940     64,005            Total  31,860     31,655     63,515

The yields for the various interactions are totalled below in Table 5:

Table 5. Total Yields for Interactions

Comparison                               Totals

(a) N and K     (P0 + P1)          K0         K1         Sum
                   N0            33,065     30,940      64,005
                   N1            31,860     31,655      63,515
                   Total         64,925     62,595     127,520

(b) P and K     (N0 + N1)          K0         K1         Sum
                   P0            31,725     30,305      62,030
                   P1            33,200     32,290      65,490
                   Total         64,925     62,595     127,520

(c) N and P     (K0 + K1)          P0         P1         Sum
                   N0            30,785     33,220      64,005
                   N1            31,245     32,270      63,515
                   Total         62,030     65,490     127,520

The sums of squares for the experiment are given in table 6. The sums of squares for
blocks, N, P, K and total can be entered directly from table 6. The sum of squares
for N x P is obtained by the subtraction of 7,503 + 374,112 (N + P) from 443,743,
which is S(x²) for the N and P totals. The result would be 62,128. The sums of
squares for N x K and for P x K are obtained in a similar manner.
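
The subtraction just described can be checked against a direct computation. The following Python sketch is an illustration, not part of the original text: it works from the eight treatment totals of Table 2, computes each single-degree-of-freedom sum of squares from its contrast, and confirms that N x P obtained by subtraction agrees.

```python
# Treatment totals over the four replications (Table 2).
totals = {"0": 15015, "N": 16710, "P": 18050, "K": 15770,
          "NP": 15150, "NK": 14535, "PK": 15170, "NPK": 17120}
replications = 4
n_plots = replications * len(totals)                 # 32
correction = sum(totals.values()) ** 2 / n_plots


def effect_ss(effect):
    """Sum of squares (1 D.F.) for an effect such as "N" or "NP"."""
    contrast = 0
    for treatment, total in totals.items():
        sign = 1
        for factor in effect:
            sign *= 1 if factor in treatment else -1
        contrast += sign * total
    return contrast ** 2 / n_plots


# Subtraction method: the four N-by-P totals (each the sum of 8 plots).
np_totals = {}
for treatment, total in totals.items():
    key = ("N" in treatment, "P" in treatment)
    np_totals[key] = np_totals.get(key, 0) + total
ss_np_table = sum(t ** 2 for t in np_totals.values()) / 8 - correction
ss_np_by_subtraction = ss_np_table - effect_ss("N") - effect_ss("P")

for effect in ("N", "P", "K", "NP", "NK", "PK"):
    print(f"{effect:<3} {effect_ss(effect):>12,.0f}")
print("N x P by subtraction:", round(ss_np_by_subtraction))
```

Both routes give the same figure, since the sum of squares among the four N-by-P totals is made up of exactly the N, P and N x P comparisons.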

Table 6. Calculation of Sum of Squares

                            Total          Divide                       Corrected
Symbol         Table     Sum Squares         by        (Sx)²/N         Sum Squares     D.F.

S(x²)            3        517,224,100         1       508,167,200       9,056,900       31
S(x²)            3        513,954,112         4       508,167,200       5,786,912        7
S(x²) N          4      8,130,795,250        16       508,167,200           7,503        1
S(x²) P          4      8,136,661,000        16       508,167,200         374,112        1
S(x²) K          4      8,133,389,650        16       508,167,200         169,653        1
S(x²) NP         5      4,068,887,550         8       508,167,200         443,743        1
S(x²) NK         5      4,067,676,450         8       508,167,200         292,356        1
S(x²) PK         5      4,069,752,750         8       508,167,200         551,893        1

The analysis of variance, together with the obtained and theoretical "F" values, is
presented in Table 7.

Table 7. Analysis of Variance

Variation                    Sum           Mean              "F" Value
due to          D.F.       Squares        Square      Obtained    5 pct. point

Blocks            7       5,786,912      826,702        5.87          2.66
N                 1           7,503        7,503       18.76        243.91
P                 1         374,112      374,112        2.66          4.41
K                 1         169,653      169,653        1.21          4.41
N x P             1          62,128       62,128        2.27        243.91
N x K             1         115,200      115,200        1.22        243.91
P x K             1           8,128        8,128       17.32        243.91
Error            18       2,533,264      140,737

Total            31       9,056,900

Where the mean square for an effect is smaller than that for error, the "F" value
shown is the error mean square divided by the mean square for the effect.

It is noted that the mean square for error has been decreased materially in the
confounded experiment as compared to that in the simple randomized block experiment.
In the former, the mean square is 140,737 while in the latter it is 196,434. It is
also to be noted that more of the variability due to soil heterogeneity has been
removed from the experimental error and drawn off in block effect, which now appears
as highly significant.

The real value of confounding as a means to bring out more closely significant
treatment effects and interactions is not evidenced in this illustration because
uniformity data have been employed. The confounding design is purely artificial.
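
For readers who wish to retrace the whole confounded analysis from the plot yields, a minimal Python sketch follows. It is an illustration only, with names chosen for the example; the layout is that of Table 3, with treatments written by the fertilizers applied ("0" for the check).

```python
# (treatment, yield) for each plot, grouped into the 8 sub-blocks of Table 3.
sub_blocks = [
    [("P", 5135), ("NPK", 4725), ("K", 4600), ("N", 4955)],      # I, A
    [("0", 3210), ("PK", 3670), ("NK", 3785), ("NP", 3965)],     # I, B
    [("N", 3175), ("K", 3980), ("P", 4420), ("NPK", 4580)],      # II, A
    [("0", 3970), ("NP", 4255), ("NK", 3665), ("PK", 4315)],     # II, B
    [("N", 4405), ("P", 4575), ("K", 3910), ("NPK", 4065)],      # III, A
    [("NK", 3510), ("0", 4305), ("PK", 3995), ("NP", 4030)],     # III, B
    [("NPK", 3750), ("P", 3920), ("N", 4175), ("K", 3280)],      # IV, A
    [("PK", 3190), ("NK", 3575), ("0", 3530), ("NP", 2900)],     # IV, B
]
plots = [plot for block in sub_blocks for plot in block]
n_plots = len(plots)                                             # 32
correction = sum(y for _, y in plots) ** 2 / n_plots

ss_total = sum(y * y for _, y in plots) - correction
ss_blocks = sum(sum(y for _, y in block) ** 2 for block in sub_blocks) / 4 - correction


def effect_ss(effect):
    """Single-degree-of-freedom sum of squares from the +/- contrast."""
    contrast = 0
    for treatment, y in plots:
        sign = 1
        for factor in effect:
            sign *= 1 if factor in treatment else -1
        contrast += sign * y
    return contrast ** 2 / n_plots


# N x P x K is not listed: it is confounded with blocks.
effects = ["N", "P", "K", "NP", "NK", "PK"]
ss_effects = {e: effect_ss(e) for e in effects}
ss_error = ss_total - ss_blocks - sum(ss_effects.values())
df_error = (n_plots - 1) - (len(sub_blocks) - 1) - len(effects)
ms_error = ss_error / df_error

print("Blocks", round(ss_blocks), "Error", round(ss_error),
      "D.F.", df_error, "MS", round(ms_error))
```

The error mean square obtained this way can be set beside the 196,434 of the simple randomized block analysis to judge the gain from the smaller blocks.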

V.  Partial  Confounding  in  a  2  by  2  by  2  Experiment 

The above procedure resulted in the complete sacrifice of the second order
interaction, but it may be argued that the experimenter has taken too much for
granted. He may overcome this difficulty by partial confounding, i.e., confounding
different interactions in different replications. Goulden (1937) states that the
results are used from the blocks in which the particular effects are not confounded
in order to recover a portion of the information desired. The fertilizer test used
as an example can be partially confounded and at the same time recover a portion of
the information on all the comparisons. Four replications will be required for this
purpose. In each replication, one degree of freedom can be confounded with blocks
for one of the interactions. There are four interactions, viz., N x P, N x K, P x K,
and N x P x K.

The algebraic relations stated previously can be used to determine the treatments to
place in each sub-block to gain the desired effect.


                                                              Sub-blocks
Interaction    Algebraic Relationship                     A                     B

N x P        = (N1 - N0)(P1 - P0)(K1 + K0)  =  (N + P + NK + PK)   -  (0 + NP + K + NPK)
N x K        = (N1 - N0)(K1 - K0)(P1 + P0)  =  (N + K + NP + PK)   -  (0 + P + NK + NPK)
P x K        = (P1 - P0)(K1 - K0)(N1 + N0)  =  (P + K + NP + NK)   -  (0 + N + PK + NPK)
N x P x K    = (N1 - N0)(P1 - P0)(K1 - K0)  =  (N + P + K + NPK)   -  (0 + NP + PK + NK)


The  treatments  within  each  sub-block  will  be  randomized.     Table  8  gives  the  field 
design  together  with  the  plot  yields   in  grams  for  the   fertilizer  trial  superimposed 
on  crested  wheatgrass  uniformity  trial   data. 

Table 8. Field Arrangement and Yields in Partially Confounded Experiment

                               Sub-block A                  Sub-block B
Replication               Treatment   Yield (gm.)      Treatment   Yield (gm.)

I                             P          5135              0          3210
(N x P confounded)            PK         4725              K          3670
                              NK         4600              NP         3785
                              N          4955              NPK        3965
                              Total     19415              Total     14630

II                            N          3175              NK         3970
(NPK confounded)              P          3980              NP         4255
                              K          4420              0          3665
                              NPK        4580              PK         4315
                              Total     16155              Total     16205

III                           NP         4405              NPK        3510
(P x K confounded)            P          4575              N          4305
                              NK         3910              PK         3995
                              K          4065              0          4030
                              Total     16955              Total     15840

IV                            N          3750              P          3190
(N x K confounded)            PK         3920              NPK        3575
                              NP         4175              NK         3530
                              K          3280              0          2900
                              Total     15125              Total     13195

Grand Total = 127,520

The treatment totals required for the computation of the sums of squares are arranged
in Table 9 for the totals of the four replications, and for the omission of each
replication.


Table 9. Treatment Totals Required for Calculation of Sums of Squares

Treat-         All            Minus            Minus             Minus             Minus
ment       Replications   Replication I   Replication II   Replication III   Replication IV
 (1)           (2)             (3)             (4)               (5)               (6)

 0            13805           10595           10140              9775             10905
 N            16185           11230           13010             11880             12435
 P            16880           11745           12900             12305             13690
 K            15435           11765           11015             11370             12155
 NP           16620           12835           12365             12215             12445
 NK           16010           11410           12040             12100             12480
 PK           16955           12230           12640             12960             13035
 NPK          15630           11665           11050             12120             12055

The sums of squares can be computed as follows for the treatment effects (each for
1 d.f.):

$$
\begin{aligned}
N &= \tfrac{1}{2k}\left[(N + NP + NK + NPK) - (0 + P + K + PK)\right]^2 \\
P &= \tfrac{1}{2k}\left[(P + NP + PK + NPK) - (0 + N + K + NK)\right]^2 \\
K &= \tfrac{1}{2k}\left[(K + NK + PK + NPK) - (0 + N + P + NP)\right]^2 \\
N \times P &= \tfrac{1}{2k}\left[(N + P + NK + PK) - (0 + NP + K + NPK)\right]^2 \\
N \times K &= \tfrac{1}{2k}\left[(N + K + NP + PK) - (0 + P + NK + NPK)\right]^2 \\
P \times K &= \tfrac{1}{2k}\left[(P + K + NP + NK) - (0 + N + PK + NPK)\right]^2 \\
N \times P \times K &= \tfrac{1}{2k}\left[(N + P + K + NPK) - (0 + NP + PK + NK)\right]^2
\end{aligned}
$$

For example, the interaction N x P is calculated from the replications in which it is
not confounded, i.e., from Column 3, Table 9. Note that k = 12.

$$
\begin{aligned}
N \times P &= \tfrac{1}{24}\left[(11230 + 11745 + 11410 + 12230) - (10595 + 11765 + 12835 + 11665)\right]^2 \\
&= \tfrac{1}{24}\left[46615 - 46860\right]^2 \\
&= \tfrac{1}{24}\left[-245\right]^2 = 60025/24 = 2501.04
\end{aligned}
$$

Similarly,

$$
\begin{aligned}
N \times P \times K &= \tfrac{1}{24}\left[(13010 + 12900 + 11015 + 11050) - (10140 + 12365 + 12040 + 12640)\right]^2 \\
&= \tfrac{1}{24}\left[47975 - 47185\right]^2 = \tfrac{1}{24}\left[790\right]^2 \\
&= 624{,}100/24 = 26{,}004
\end{aligned}
$$

The main effects are calculated from all the replications; i.e., k = 16. The
calculation for N is as follows:

$$
\begin{aligned}
N &= \tfrac{1}{32}\left[(16185 + 16620 + 16010 + 15630) - (13805 + 16880 + 15435 + 16955)\right]^2 \\
&= \tfrac{1}{32}\left[64445 - 63075\right]^2 = \tfrac{1}{32}\left[1370\right]^2 \\
&= 1{,}876{,}900/32 = 58{,}653
\end{aligned}
$$
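
The rule of omitting the replication in which an interaction is confounded lends itself to a short program. The Python sketch below is an illustration, not part of the original text, and its names are hypothetical: it works from the plot yields of Table 8 rather than from the totals of Table 9, and drops the appropriate replication for each interaction automatically.

```python
# For each replication: the interaction confounded in it, and the plot yields
# by treatment (Table 8).
replications = {
    "I":   ("NP",  {"P": 5135, "PK": 4725, "NK": 4600, "N": 4955,
                    "0": 3210, "K": 3670, "NP": 3785, "NPK": 3965}),
    "II":  ("NPK", {"N": 3175, "P": 3980, "K": 4420, "NPK": 4580,
                    "NK": 3970, "NP": 4255, "0": 3665, "PK": 4315}),
    "III": ("PK",  {"NP": 4405, "P": 4575, "NK": 3910, "K": 4065,
                    "NPK": 3510, "N": 4305, "PK": 3995, "0": 4030}),
    "IV":  ("NK",  {"N": 3750, "PK": 3920, "NP": 4175, "K": 3280,
                    "P": 3190, "NPK": 3575, "NK": 3530, "0": 2900}),
}


def effect_ss(effect):
    """Sum of squares (1 D.F.), using only replications where the effect is free."""
    usable = [yields for confounded, yields in replications.values()
              if confounded != effect]
    k = 4 * len(usable)              # plots in each of the two totals compared
    contrast = 0
    for yields in usable:
        for treatment, y in yields.items():
            sign = 1
            for factor in effect:
                sign *= 1 if factor in treatment else -1
            contrast += sign * y
    return contrast ** 2 / (2 * k)


for effect in ("N", "P", "K", "NP", "NK", "PK", "NPK"):
    print(f"{effect:<4} {effect_ss(effect):>12,.2f}")
```

Main effects are never confounded, so all four replications enter and k = 16; each interaction loses one replication, so k = 12.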

The total sum of squares is calculated from all plot yields in all replications of
the experiment, i.e., 32 plots. The block sum of squares is computed from the 8
block totals. The ordinary method of computation is used.

The  analysis  of  variance  can  be  set  up  as  follows: 

Table 10. Complete Analysis for Partially Confounded 2x2x2 Experiment

Variation               Sum of        Mean         F Value
due to        D.F.      Squares       Square       Obtained    5 Pct. Point

Blocks          7      5,786,912     826,702         5.87         2.70
N               1         58,653      58,653         2.40       243.91
P               1        675,703     675,703         4.79         4.45
K               1          9,112       9,112        15.46       243.91
N x P           1          2,501       2,501        56.34       243.91
N x K           1         36,817      36,817         3.83       243.91
P x K           1         65,626      65,626         2.15       243.91
N x P x K       1         26,004      26,004         5.42       243.91
Error          17      2,395,572     140,916

Total          31      9,056,900

In this experiment, information is obtained on the main effects and on all interactions,
including the second order interaction. However, there is a loss of one-fourth of
the information on each of the interactions, because the replication in which an
interaction was confounded was omitted in the calculation of its sum of squares.
The error is of approximately the same magnitude as that for the experiment
in which N x P x K was completely confounded.

Questions for Discussion

1.  What  is  a  factorial  experiment?  Give  an  example. 

2. Under what conditions may a factorial experiment be used?

3. What is meant by the term "orthogonality"?
Give an example of an orthogonal experiment.

4. Explain the use of the term "confounding". What is done in confounding? Why?

5. Suppose a second-order interaction, N x P x K, is to be confounded. How can this
be done by design?

6.  What  is  partial  confounding?  How  does  it  differ  from  confounding? 


Problems 

Some  uniformity  data  presented  by  Wiebe  (1935)  on  wheat  yields  in  grams  per  row  are 
presented  below  as  they  occurred  in  the  field: 


Plot              Blocks
No.       I      II     III     IV

1        670    690     785    645
2        685    790     770    665
3        660    825     960    750
4        705    805     860    635
5        610    720     705    615
6        640    735     805    665
7        690    855     905    700
8        715    765     945    820

1.  Calculate  these  data  as  a  randomized  block  experiment  using  the  8  fertilizer  treat- 
ments given  in  the  text  example. 

2.  Design  an  experiment  so  as  to  confound  the  second  order  interaction,  N  x  P  x  K. 
Carry  through  the  complete  analysis.  Compare  the  results  with  those  obtained  in 
problem  1. 

3.  Design  an  experiment  to  superimpose  on  these  data  so  as  to  partially  confound  the 
second  order  interaction  (N  x  P  x  K) .  Carry  through  the  complete  analysis.  Com- 
pare the  results  with  those  obtained  in  problems  1  and  2. 


CHAPTER  XX 

SYMMETRICAL  INCOMPLETE  BLOCK  EXPERIMENTS 

I. Incomplete Block Tests

It has been shown (Chapter 19) that greater accuracy is obtained in factorial
experiments when certain degrees of freedom for the higher-order interactions are
confounded with blocks, especially when the number of combinations is large. In
variety trials it is sometimes desirable to test a large number of varieties in a single
experiment. To compare them in an ordinary randomized block test leads to less
accuracy due to the large size of the blocks. Methods have been developed by Yates
(1936, 1937) to overcome this difficulty. The procedure is analogous to confounding
in factorial experiments in that the replications are divided up into smaller blocks
which are used as error control units. These small blocks contain only part of the
total number of varieties, hence the name "incomplete blocks".

Incomplete  block  experiments  have  been  shown  to  give  increased  efficiency  by  Yates 
(1936},  and  Goulden  (1937).  Weiss  and  Cox  (1939)  found  the  lattice  square  arrange- 
ment to  result  in  a  gain  of  l';0  per  cent  on  extremely  heterogenous  soil,  but  a  loss 
of  precision  of  3I>5  per  cent  on  a  very  uniform  soil. 

One type of incomplete block experiment will be illustrated, i.e., the symmetrical
incomplete block where all possible groups of sets are used. The computation
procedure will follow closely that described by Weiss and Cox (1939). For other types
of incomplete blocks, Goulden (1937, 1939) should be consulted. These include the
two dimensional quasi-factorial with two groups of sets, and the three dimensional
quasi-factorial with three groups of sets. An excellent discussion of the lattice
square design (quasi-Latin squares) is given by Weiss and Cox (1939), who applied it
to a soybean variety test.

The computations will be illustrated with some uniformity trial data obtained from
Dr. R. M. Weihing on forage yields of crested wheatgrass expressed in kilograms.
The plots consist of 3 rows, 15 feet long, the individual rows being 6 inches apart.

II. Design of Symmetrical Incomplete Block Tests

In order to determine the details of an acceptable design with regard to the number
of varieties and blocks to use, it is necessary to satisfy the condition that each
variety occur with every other variety in the same number of blocks. Suppose that
m varieties are replicated n times over a portion of the available blocks, each of
which is to contain n' plots. Consider the n blocks in which one certain variety
occurs. The total number of plots contained in these n blocks is obviously (n)(n'),
of which n correspond to the one variety under consideration. Therefore, there are
(n)(n') - n = (n'-1)(n) plots available for the other m-1 varieties in those blocks.
To meet the above condition, these (n'-1)(n) plots must be distributed equally among
the m-1 varieties that remain. For this reason, (n'-1)(n) must either equal m-1 or
be a multiple of m-1. Thus, it becomes apparent that m-1 = (n'-1)(n) is a number
that must be factorable, preferably into two numbers of nearly equal size. This can
be effected in two different ways.

(1) First, one may use m = k^2, where k is an integer, from which m-1 = k^2-1 =
(k-1)(k+1). From this, it would appear that the choice of design could be either
k-1 = n'-1 (i.e., n' = k, whence n will be k+1), or k+1 = n'-1 (i.e., n' = k+2),
in which case n will be k-1. However, when m' is equal to the total number of
blocks, one must have mn = m'n'. Thus, it is clear that mn must be divisible by n'.
The first choice gives mn to be k^2(k+1), in which the divisibility is assured with
m' = k(k+1). The second choice gives mn/n' = k^2(k-1)/(k+2), which generally would
not be an integer. Thus, only the first choice is acceptable.

(2) Second, one may choose m = k^2-k+1, from which m-1 = k^2-k = k(k-1). From
this relationship, it appears that one has a choice of design by the use of either
k-1 = n'-1 (i.e., n' = k), from which n will also be k, or k = n'-1 (i.e., n' =
k+1), in which case n will be k-1. In the analysis of this situation, the first
choice gives mn/n' = (k^2-k+1)k/k. The divisibility is assured with the result that
m' = k^2-k+1. For the second choice, mn/n' = (k^2-k+1)(k-1)/(k+1), a value that is
not generally an integer. Thus only the first choice is acceptable.

Therefore, it is obvious that designs of this nature can be constructed for m = k^2
where m varieties = 9, 16, 25, 36, 49, 64, etc. The k^2-k+1 type can be designed for
values of m = 7, 13, 21, 31, 43, 57, 73, etc. The structure of the arrangements is
rather fully discussed by Yates (1936), Fisher and Yates (1938), and by Goulden
(1937, 1939).
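The counting argument of this section can be sketched in Python as follows. The function name `counting_check` is an assumption for illustration; it tests only the two arithmetic conditions used above (a whole number of blocks and a whole number of joint occurrences per pair of varieties), not whether an arrangement can actually be written out.

```python
from fractions import Fraction

def counting_check(m, n_plots, n_reps):
    """Return (number of blocks, pair occurrences) if both are whole numbers, else None."""
    pair_count = Fraction(n_reps * (n_plots - 1), m - 1)   # n(n'-1)/(m-1)
    blocks = Fraction(m * n_reps, n_plots)                 # m' = mn/n'
    if pair_count.denominator == 1 and blocks.denominator == 1:
        return int(blocks), int(pair_count)
    return None

# Type (1): m = k^2 with n' = k and n = k + 1
square_type = {k * k: counting_check(k * k, k, k + 1) for k in range(3, 9)}

# Type (2): m = k^2 - k + 1 with n' = n = k
second_type = {k * k - k + 1: counting_check(k * k - k + 1, k, k)
               for k in range(3, 10)}

print(square_type)
print(second_type)
print(counting_check(9, 5, 2))   # rejected choice for k = 3: n' = k + 2, n = k - 1
```

For the 25-variety case the sketch returns 30 blocks with each pair of varieties together once, in agreement with the example that follows.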

The first type, m = k^2 = n'^2, will be used to illustrate the process for a
completely orthogonalized 5 by 5 square. This will give a series of symmetrical
incomplete block arrangements.


1111  (1)    2222  (2)    3333  (3)    4444  (4)    5555  (5)

2345  (6)    3451  (7)    4512  (8)    5123  (9)    1234 (10)

3524 (11)    4135 (12)    5241 (13)    1352 (14)    2413 (15)

4253 (16)    5314 (17)    1425 (18)    2531 (19)    3142 (20)

5432 (21)    1543 (22)    2154 (23)    3215 (24)    4321 (25)

The explanation of the arrangement is taken directly from Weiss and Cox (1939). The
numbers in parentheses designate the varieties which are to be compared in the
experiment. "These variety numbers are arranged in 6 orthogonal groups as follows:


    Group I              Group II             Group III
    (rows)               (columns)            (first number)

 1  2  3  4  5        1  6 11 16 21        1 10 14 18 22
 6  7  8  9 10        2  7 12 17 22        2  6 15 19 23
11 12 13 14 15        3  8 13 18 23        3  7 11 20 24
16 17 18 19 20        4  9 14 19 24        4  8 12 16 25
21 22 23 24 25        5 10 15 20 25        5  9 13 17 21

    Group IV             Group V              Group VI
    (second number)      (third number)       (fourth number)

 1  9 12 20 23        1  8 15 17 24        1  7 13 19 25
 2 10 13 16 24        2  9 11 18 25        2  8 14 20 21
 3  6 14 17 25        3 10 12 19 21        3  9 15 16 22
 4  7 15 18 21        4  6 13 20 22        4 10 11 17 23
 5  8 11 19 22        5  7 14 16 23        5  6 12 18 24

"In group I the variety numbers are copied from the rows of the square, each row of
the group specifying a block in the field. In like manner, the variety numbers in
the blocks of group II are taken from the columns of the square. In group III the
varieties in a block are specified by the numbers written first in the cells of the
square. Thus, the varieties in the first block are those corresponding to number
1 wherever it occurs first in the cell; as examples, variety 1 is from row 1 column
1 of the completely orthogonalized square, variety 10 from row 2 column 5, variety
14 from row 3 column 4, etc. For group IV, the second numbers in the cells of the
square are used to pick out the varieties. Thus, for the third block the number 3
is located in row 1 column 3 (variety 3), in row 2 column 1 (variety 6), etc.

"This set of six orthogonal groups constitutes a balanced incomplete block
arrangement: in the 30 blocks of 5 plots, each of the 25 varieties occurs 6 times, once
and once only with every other variety. The combination solution in the unreduced
form would require a prohibitive number of blocks 1/."
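The rule for writing out the six groups can be sketched in Python as below. The cell entries are those of the completely orthogonalized 5 by 5 square; the names `SQUARE`, `variety` and `build_groups` are assumptions made for illustration. The sketch also counts how often each pair of varieties shares a block, and the number of blocks the unreduced solution (every set of 5 out of 25 varieties) would need.

```python
from collections import Counter
from itertools import combinations
from math import comb

SQUARE = [
    ["1111", "2222", "3333", "4444", "5555"],
    ["2345", "3451", "4512", "5123", "1234"],
    ["3524", "4135", "5241", "1352", "2413"],
    ["4253", "5314", "1425", "2531", "3142"],
    ["5432", "1543", "2154", "3215", "4321"],
]

def variety(row, col, k=5):
    """Variety number of a cell, counted along the rows of the square."""
    return row * k + col + 1

def build_groups(square):
    k = len(square)
    groups = [
        [[variety(r, c, k) for c in range(k)] for r in range(k)],   # group I: rows
        [[variety(r, c, k) for r in range(k)] for c in range(k)],   # group II: columns
    ]
    # Groups III to VI: the first, second, third and fourth number in each cell.
    for pos in range(len(square[0][0])):
        group = [[] for _ in range(k)]
        for r in range(k):
            for c in range(k):
                group[int(square[r][c][pos]) - 1].append(variety(r, c, k))
        groups.append(group)
    return groups

groups = build_groups(SQUARE)
blocks = [block for group in groups for block in group]
pair_counts = Counter(pair for block in blocks for pair in combinations(block, 2))
unreduced = comb(25, 5)

print(groups[2][0], len(blocks), set(pair_counts.values()), unreduced)
```

With 25 varieties there are 300 distinct pairs, so a count of exactly 1 for every pair is what "once and once only" requires.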


The field arrangement for this type of symmetrical incomplete block design will be
illustrated with the crested wheatgrass data. There are 25 varieties arranged in
6 replicates with 5 varieties in each block. The 5 varieties are randomized within
each block. The block and replicate arrangement in the field may be as follows: 2/


  I       II      III      IV       V       VI

  5b      6b      13b      20b      24b     27b
  4b      7b      11b      18b      25b     29b
  1b     10b      12b      16b      21b     26b
  2b      8b      14b      19b      22b     28b
  3b      9b      15b      17b      23b     30b

III.  Statistical  Analysis  of  Incomplete  Block  Data 


The symbols used in the discussion follow:

m = number of varieties (25)

n' = number of plots per block (5)

n = number of replicates of each variety (6)

m' = number of blocks (30)

N = mn = m'n' = total number of plots (150)

λ = n(n'-1)/(m-1) = number of times any 2 varieties occur together in a block (1)

E = (1 - 1/n')/(1 - 1/m) = Efficiency Factor of Design (5/6)

Sx = Sum of all N experimental values (217.79)

S'_V x = Sum of n experimental values for any one variety.

S'_B x = Sum of k (= n') experimental values for any one block.

s^2 = Error variance of a single experimental value.


1/ C(m, n') = C(25, 5) = 25!/(5! 20!) = 53,130.


2/ The blocks (5b, 4b, etc.) were arranged consecutively for the analysis of the data
used in this problem, but they should be randomized (at least within replicates) in
an actual field experiment. The Roman numerals refer to replicates.

(a) Computation of Block Totals

The yield data for the incomplete block experiment may be assembled as shown
in Table 1 for the computation of the block totals. The numbers in parentheses
refer to "varieties". The forage yields of crested wheatgrass are expressed as
kilograms per plot.

Table 1. Plot Yields of the Symmetrical Incomplete Blocks Assembled for 25 Crested
Wheatgrass "Varieties" in 6 Replicates.

            Set or                                                                  Block
Replicate   Block                        Plots in Block                             Totals

I             1     (5) 1.25    (2) 1.52    (4) 1.30    (3) 1.83    (1) 1.64        7.54
              2    (10) 1.38    (7) 1.48    (8) 1.41    (9) 1.52    (6) 1.46        7.25
              3    (11) 1.42   (14) 1.40   (13) 1.35   (15) 1.32   (12) 1.19        6.68
              4    (20) 1.20   (18) 1.32   (17) 1.59   (16) 1.21   (19) 1.22        6.54
              5    (21) 1.48   (25) 1.14   (23) 1.16   (22) 1.54   (24) 1.67        6.99

II            6     (1) 1.27   (16) 1.54   (21) 1.66    (6) 1.81   (11) 1.96        8.24
              7    (12) 1.92   (17) 1.36    (2) 1.32   (22) 1.38    (7) 1.65        7.63
              8    (23) 1.63    (3) 1.16   (13) 1.47    (8) 1.24   (18) 1.32        6.82
              9    (19) 1.04    (9) 1.42   (14) 1.16   (24) 0.92    (4) 0.99        5.53
             10    (15) 1.34   (25) 1.38   (10) 1.02    (5) 1.22   (20) 1.72        6.68

III          11    (18) 1.86    (1) 1.94   (14) 1.84   (22) 1.96   (10) 1.92        9.52
             12    (15) 1.64    (2) 1.64   (23) 1.66   (19) 1.78    (6) 1.83        8.55
             13    (20) 1.84    (7) 1.27   (11) 1.20    (3) 1.29   (24) 1.29        6.89
             14     (4) 1.33   (12) 1.54   (16) 1.16    (8) 1.48   (25) 1.54        7.05
             15    (13) 1.50    (9) 1.18   (21) 1.24    (5) 1.35   (17) 1.53        6.80

IV           16     (1) 1.00   (20) 1.30   (23) 1.24   (12) 1.64    (9) 1.62        6.80
             17    (24) 1.62   (16) 1.48   (13) 1.66   (10) 1.62    (2) 1.83        8.21
             18     (3) 1.60    (6) 1.44   (17) 1.60   (14) 1.56   (25) 1.84        8.04
             19     (4) 1.30    (7) 1.31   (21) 1.45   (15) 1.51   (18) 1.72        7.29
             20    (22) 1.56    (5) 1.44    (8) 1.34   (11) 1.58   (19) 1.25        7.17

V            21    (15) 1.48    (8) 1.72   (17) 1.68   (24) 1.94    (1) 1.84        8.66
             22     (9) 1.48    (2) 1.48   (11) 1.29   (25) 1.36   (18) 1.67        7.28
             23    (19) 1.48   (12) 1.40   (21) 1.26    (3) 1.44   (10) 1.86        7.44
             24    (22) 1.40   (20) 1.38    (4) 1.42   (13) 1.74    (6) 1.55        7.49
             25    (16) 1.50    (7) 1.35    (5) 1.64   (23) 1.68   (14) 1.32        7.49

VI           26     (1) 1.22   (25) 1.22    (7) 1.70   (13) 1.56   (19) 1.60        7.30
             27     (2) 1.62    (8) 1.43   (20) 1.50   (21) 1.44   (14) 1.56        7.55
             28    (22) 1.32    (9) 0.93    (3) 1.46   (15) 1.36   (16) 1.41        6.48
             29    (10) 1.12   (17) 1.50   (11) 1.18    (4) 1.19   (23) 1.10        6.09
             30     (6) 1.08   (18) 1.00    (5) 1.08   (12) 1.32   (24) 1.31        5.79

Grand Total                                                                       217.79


(b) Computation of Variety Means

In  symmetrical  incomplete  block  designs,  a  preliminary  step  is  required  to 
obtain  the  sum  of  squares  for  varieties.  Due  to  the  fact  that  variety  differences 
are  partially  confounded  with  block  effects,  it  is  necessary  to  compute  each  variety 
sum  by  a  formula  that  involves  both  the  yields  of  the  plots  planted  to  the  variety 
and  the  yields  of  the  blocks  in  which  the  variety  occurs. 



The first step is to accumulate the variety sums which are recorded in
table 2, column 2. The yields for each variety are collected from table 1. For
example, the total yield of variety 1 is:

S'_V x = 1.64 + 1.27 + 1.94 + 1.00 + 1.84 + 1.22 = 8.91

For each variety total there is also a sum of block totals (S'_V S'_B x) which is
recorded in table 2. Since variety 1 appears in blocks 1, 6, 11, 16, 21, and 26,

S'_V S'_B x = 7.54 + 8.24 + 9.52 + 6.80 + 8.66 + 7.30 = 48.06
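The accumulation of block totals, variety totals and the sum of block totals for each variety can be sketched in Python as follows. The data are the plot yields as read from table 1 (variety number, kilograms); the names `BLOCKS`, `variety_total` and `variety_block_sum` are assumptions made for illustration.

```python
from collections import defaultdict

# One entry per block of table 1, blocks 1 to 30: (variety, yield in kg) per plot.
BLOCKS = [
    [(5, 1.25), (2, 1.52), (4, 1.30), (3, 1.83), (1, 1.64)],
    [(10, 1.38), (7, 1.48), (8, 1.41), (9, 1.52), (6, 1.46)],
    [(11, 1.42), (14, 1.40), (13, 1.35), (15, 1.32), (12, 1.19)],
    [(20, 1.20), (18, 1.32), (17, 1.59), (16, 1.21), (19, 1.22)],
    [(21, 1.48), (25, 1.14), (23, 1.16), (22, 1.54), (24, 1.67)],
    [(1, 1.27), (16, 1.54), (21, 1.66), (6, 1.81), (11, 1.96)],
    [(12, 1.92), (17, 1.36), (2, 1.32), (22, 1.38), (7, 1.65)],
    [(23, 1.63), (3, 1.16), (13, 1.47), (8, 1.24), (18, 1.32)],
    [(19, 1.04), (9, 1.42), (14, 1.16), (24, 0.92), (4, 0.99)],
    [(15, 1.34), (25, 1.38), (10, 1.02), (5, 1.22), (20, 1.72)],
    [(18, 1.86), (1, 1.94), (14, 1.84), (22, 1.96), (10, 1.92)],
    [(15, 1.64), (2, 1.64), (23, 1.66), (19, 1.78), (6, 1.83)],
    [(20, 1.84), (7, 1.27), (11, 1.20), (3, 1.29), (24, 1.29)],
    [(4, 1.33), (12, 1.54), (16, 1.16), (8, 1.48), (25, 1.54)],
    [(13, 1.50), (9, 1.18), (21, 1.24), (5, 1.35), (17, 1.53)],
    [(1, 1.00), (20, 1.30), (23, 1.24), (12, 1.64), (9, 1.62)],
    [(24, 1.62), (16, 1.48), (13, 1.66), (10, 1.62), (2, 1.83)],
    [(3, 1.60), (6, 1.44), (17, 1.60), (14, 1.56), (25, 1.84)],
    [(4, 1.30), (7, 1.31), (21, 1.45), (15, 1.51), (18, 1.72)],
    [(22, 1.56), (5, 1.44), (8, 1.34), (11, 1.58), (19, 1.25)],
    [(15, 1.48), (8, 1.72), (17, 1.68), (24, 1.94), (1, 1.84)],
    [(9, 1.48), (2, 1.48), (11, 1.29), (25, 1.36), (18, 1.67)],
    [(19, 1.48), (12, 1.40), (21, 1.26), (3, 1.44), (10, 1.86)],
    [(22, 1.40), (20, 1.38), (4, 1.42), (13, 1.74), (6, 1.55)],
    [(16, 1.50), (7, 1.35), (5, 1.64), (23, 1.68), (14, 1.32)],
    [(1, 1.22), (25, 1.22), (7, 1.70), (13, 1.56), (19, 1.60)],
    [(2, 1.62), (8, 1.43), (20, 1.50), (21, 1.44), (14, 1.56)],
    [(22, 1.32), (9, 0.93), (3, 1.46), (15, 1.36), (16, 1.41)],
    [(10, 1.12), (17, 1.50), (11, 1.18), (4, 1.19), (23, 1.10)],
    [(6, 1.08), (18, 1.00), (5, 1.08), (12, 1.32), (24, 1.31)],
]

block_totals = [round(sum(y for _, y in block), 2) for block in BLOCKS]

variety_total = defaultdict(float)       # S'x over the plots of a variety
variety_block_sum = defaultdict(float)   # S'S'x over the blocks holding it
for block, total in zip(BLOCKS, block_totals):
    for v, y in block:
        variety_total[v] += y
        variety_block_sum[v] += total

print(round(sum(block_totals), 2))
print(round(variety_total[1], 2), round(variety_block_sum[1], 2))
```

The grand total of the block totals gives Sx, and the two values printed for variety 1 correspond to the worked example above.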


Table 2. Computation of Variety Means for the Crested Wheatgrass Experiment with 25
"Varieties" in 6 Replications.

                                                      Variety   Block tots.    Q =                  Variety
               Yields in Kg. by Replicate             Totals    for each       n'S'_V x             Means
Variety     I     II    III    IV     V     VI        S'_V x    Variety        - S'_V S'_B x  d =   Sx/N + d
                                                                S'_V S'_B x                   Q/25

  1       1.64  1.27  1.94  1.00  1.84  1.22           8.91       48.06        -3.51    -0.14    1.31
  2       1.52  1.32  1.64  1.83  1.48  1.62           9.41       46.76        +0.29    +0.01    1.46
  3       1.83  1.16  1.29  1.60  1.44  1.46           8.78       43.21        +0.69    +0.03    1.48
  4       1.30  0.99  1.33  1.30  1.42  1.19           7.53       40.99        -3.34    -0.13    1.32
  5       1.25  1.22  1.35  1.44  1.64  1.08           7.98       41.47        -1.57    -0.06    1.39
  6       1.46  1.81  1.83  1.44  1.55  1.08           9.17       45.36        +0.49    +0.02    1.47
  7       1.48  1.65  1.27  1.31  1.35  1.70           8.76       43.85        -0.05     0.00    1.45
  8       1.41  1.24  1.48  1.34  1.72  1.43           8.62       44.50        -1.40    -0.06    1.39
  9       1.52  1.42  1.18  1.62  1.48  0.93           8.15       40.14        +0.61    +0.02    1.47
 10       1.38  1.02  1.92  1.62  1.86  1.12           8.92       45.19        -0.59    -0.02    1.43
 11       1.42  1.96  1.20  1.58  1.29  1.18           8.63       42.35        +0.80    +0.03    1.48
 12       1.19  1.92  1.54  1.64  1.40  1.32           9.01       41.39        +3.66    +0.15    1.60
 13       1.35  1.47  1.50  1.66  1.74  1.56           9.28       43.30        +3.10    +0.12    1.57
 14       1.40  1.16  1.84  1.56  1.32  1.56           8.84       44.81        -0.61    -0.02    1.43
 15       1.32  1.34  1.64  1.51  1.48  1.36           8.65       44.34        -1.09    -0.04    1.41
 16       1.21  1.54  1.16  1.48  1.50  1.41           8.30       44.01        -2.51    -0.10    1.35
 17       1.59  1.36  1.53  1.60  1.68  1.50           9.26       43.76        +2.54    +0.10    1.55
 18       1.32  1.32  1.86  1.72  1.67  1.00           8.89       43.24        +1.21    +0.05    1.50
 19       1.22  1.04  1.78  1.25  1.48  1.60           8.37       42.53        -0.68    -0.03    1.42
 20       1.20  1.72  1.84  1.30  1.38  1.50           8.94       41.95        +2.75    +0.11    1.56
 21       1.48  1.66  1.24  1.45  1.26  1.44           8.53       44.31        -1.66    -0.07    1.38
 22       1.54  1.38  1.96  1.56  1.40  1.32           9.16       45.28        +0.52    +0.02    1.47
 23       1.16  1.63  1.66  1.24  1.68  1.10           8.47       42.74        -0.39    -0.02    1.43
 24       1.67  0.92  1.29  1.62  1.94  1.31           8.75       42.07        +1.68    +0.07    1.52
 25       1.14  1.38  1.54  1.84  1.36  1.22           8.48       43.34        -0.94    -0.04    1.41

Totals 1/                                            217.79     1088.95         0.00     0.00


1/ The sum of the S'_V x column (217.79) is equal to Sx, while the sum of the
S'_V S'_B x column is equal to n'Sx. Therefore, the computations can be verified:
(5)(217.79) = 1088.95.

For the computation of Q, the block sums are subtracted from 5 times the variety
totals, i.e., Q = n' S'_V x - S'_V S'_B x

For example, for variety 1,

Q = 5(8.91) - 48.06 = -3.51

The Q value is then divided by the number of varieties in the test (25) to give the
values for d in table 2. Thus, d is the deviation of a variety mean from the mean
yield of all the varieties in the experiment.

The  best  estimate  of  the  variety  means  is  Sx/N  +  d. 

As an illustration, the mean of variety 1 is,

Sx/N + d = 217.79/150 + (-0.14) = 1.31

In the variety means, consideration has been given to the effect of partial
confounding of variety differences with block effects. They are the best estimates of the
yield performance.
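The steps from variety totals to corrected variety means can be sketched in Python as below. The two input lists are the variety totals and the sums of block totals as read from table 2, in variety order 1 to 25; the variable names are assumptions made for illustration.

```python
N_PLOTS_PER_BLOCK = 5    # n'
N_VARIETIES = 25         # m
N_TOTAL = 150            # N

# Variety totals (S'x over a variety) for varieties 1 to 25
variety_totals = [8.91, 9.41, 8.78, 7.53, 7.98, 9.17, 8.76, 8.62, 8.15, 8.92,
                  8.63, 9.01, 9.28, 8.84, 8.65, 8.30, 9.26, 8.89, 8.37, 8.94,
                  8.53, 9.16, 8.47, 8.75, 8.48]

# Sum of the totals of the six blocks in which each variety occurs
block_sums = [48.06, 46.76, 43.21, 40.99, 41.47, 45.36, 43.85, 44.50, 40.14, 45.19,
              42.35, 41.39, 43.30, 44.81, 44.34, 44.01, 43.76, 43.24, 42.53, 41.95,
              44.31, 45.28, 42.74, 42.07, 43.34]

grand_total = sum(variety_totals)

# Q = n' * (variety total) - (sum of block totals); d = Q / m
q = [N_PLOTS_PER_BLOCK * t - b for t, b in zip(variety_totals, block_sums)]
d = [qi / N_VARIETIES for qi in q]
means = [grand_total / N_TOTAL + di for di in d]

for number, (qi, di, mean) in enumerate(zip(q, d, means), start=1):
    print(f"{number:2d}  Q = {qi:+.2f}  d = {di:+.2f}  mean = {mean:.2f}")
```

Since d is carried here without intermediate rounding, an occasional mean may differ by 0.01 from the tabled value, which appears to be built from d rounded to two decimals.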

(c)  Derivation  of  Sums  of  Squares 

The sums of squares may now be computed. The correction factor is the square
of the total divided by the number of plots, viz.,

(Sx)^2/N = (217.79)^2/150 = 47432.4841/150 = 316.22

The total sum of squares is obtained in the usual manner, i.e., by the addition
of the squares of each individual plot yield with the correction factor subtracted:

(1.25)^2 + (1.52)^2 + ... + (1.31)^2 - 316.22 = 8.23

The sum of squares between means of blocks is obtained by the addition of
the squares of the block totals, these being divided by the number of plots which
make up each block total. The correction term is subtracted from this value.

[(7.54)^2 + (7.25)^2 + ... + (5.79)^2]/5 - 316.22 = 4.17

The sum of squares between means of varieties is obtained from each Q value
squared, added, and divided by N:

[(-3.51)^2 + (0.29)^2 + ... + (-0.94)^2]/150 = 0.56
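The correction factor and the block and variety sums of squares can be sketched in Python as follows. The block totals and Q values are those read from tables 1 and 2; the variety sum of squares uses the divisor N exactly as stated in the text. Variable names are assumptions made for illustration.

```python
N_TOTAL = 150            # N
N_PLOTS_PER_BLOCK = 5    # n'

block_totals = [7.54, 7.25, 6.68, 6.54, 6.99, 8.24, 7.63, 6.82, 5.53, 6.68,
                9.52, 8.55, 6.89, 7.05, 6.80, 6.80, 8.21, 8.04, 7.29, 7.17,
                8.66, 7.28, 7.44, 7.49, 7.49, 7.30, 7.55, 6.48, 6.09, 5.79]

q_values = [-3.51, 0.29, 0.69, -3.34, -1.57, 0.49, -0.05, -1.40, 0.61, -0.59,
            0.80, 3.66, 3.10, -0.61, -1.09, -2.51, 2.54, 1.21, -0.68, 2.75,
            -1.66, 0.52, -0.39, 1.68, -0.94]

grand_total = sum(block_totals)
correction = grand_total ** 2 / N_TOTAL

# Blocks: squared block totals over plots per block, less the correction factor.
ss_blocks = sum(b * b for b in block_totals) / N_PLOTS_PER_BLOCK - correction

# Varieties: squared Q values summed and divided by N, following the text.
ss_varieties = sum(qv * qv for qv in q_values) / N_TOTAL

# Error by subtraction, taking the total sum of squares (8.23) from the text.
ss_error = 8.23 - ss_blocks - ss_varieties

print(round(correction, 2), round(ss_blocks, 2),
      round(ss_varieties, 2), round(ss_error, 2))
```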


The analysis of variance is presented in table 3.

Table 3. Analysis of Variance of Symmetrical Incomplete Block Design

Source of                Sum of      Mean
Variation      D.F.      Squares     Square      F-Value

Blocks          29        4.17       0.1438      3.94**
Varieties       24        0.56       0.0233      1.57
Error           96        3.50       0.0365
Total          149        8.23



The standard error of the plot yields is

s = sqrt(0.0365) = 0.19 kilograms

The standard error of the difference between two of the corrected means will be

sqrt[(2s^2/n)((n'+1)/n')] = sqrt[((2)(0.0365)/6)(6/5)] = 0.12
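The mean squares, the variance ratio for blocks and the two standard errors can be sketched in Python as below. The degrees of freedom and sums of squares are those of table 3; the dictionary layout and variable names are assumptions made for illustration.

```python
from math import sqrt

N_REPS = 6               # n
N_PLOTS_PER_BLOCK = 5    # n'

# (degrees of freedom, sum of squares) from table 3
table3 = {"Blocks": (29, 4.17), "Varieties": (24, 0.56), "Error": (96, 3.50)}
mean_square = {name: ss / df for name, (df, ss) in table3.items()}

error_variance = mean_square["Error"]      # s^2
s = sqrt(error_variance)                   # standard error of a single plot yield
f_blocks = mean_square["Blocks"] / error_variance

# Standard error of the difference between two corrected variety means
se_difference = sqrt(2 * error_variance / N_REPS
                     * (N_PLOTS_PER_BLOCK + 1) / N_PLOTS_PER_BLOCK)

print(round(error_variance, 4), round(s, 2),
      round(f_blocks, 2), round(se_difference, 2))
```

The factor (n'+1)/n' enlarges the ordinary standard error of a difference to allow for the partial confounding of varieties with blocks.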


IV.  Efficiency  Factor 

The symmetrical incomplete block design is less efficient than the complete
randomized block arrangement for equal numbers of replications when the soil is
homogeneous. This is because there has been no reduction in error variance due to the
reduction of block size. The efficiency of the incomplete block design as compared to
randomized complete blocks is expressed by the fraction (1 - 1/n')/(1 - 1/m) when the
replication numbers in each arrangement are equal. In soils that are heterogeneous the
reduction in block size usually more than compensates for the loss of information due
to the arrangement. Goulden (1937) concluded that an increase of precision of 20
to 50 per cent was obtained over the complete randomized block arrangement.
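The efficiency factor can be evaluated for several designs with a short Python sketch. The (m, n') pairs are taken as the two design types of this chapter, m = k^2 and m = k^2 - k + 1 with n' = k, for k from 4 to 9; this choice of pairs and the function name `efficiency_factor` are assumptions made for illustration.

```python
from fractions import Fraction

def efficiency_factor(m, n_plots):
    """E = (1 - 1/n') / (1 - 1/m), kept as an exact fraction."""
    return (1 - Fraction(1, n_plots)) / (1 - Fraction(1, m))

# (number of varieties m, plots per block n')
designs = [(k * k - k + 1, k) for k in range(4, 10)] + [(k * k, k) for k in range(4, 9)]
factors = {m: efficiency_factor(m, n_plots) for m, n_plots in sorted(designs)}

for m, e in factors.items():
    print(f"m = {m:2d}   E = {e} = {float(e):.3f}")
```

For the 25-variety example the factor is 5/6, so on perfectly uniform soil about one-sixth of the information would be given up in return for the smaller blocks.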

In addition to the doubtful value of the symmetrical incomplete block design on very
uniform soils, Weiss and Cox (1939) advise that the design not be employed to
compare varieties which have an extremely large range in yields. However, poor
varieties are usually eliminated in preliminary trials. The symmetrical incomplete block
arrangement would provide a means to accurately determine relatively small
differences between select varieties.


Goulden (1939) gives a list of the n' and n values for different numbers of varieties
for which symmetrical incomplete blocks may be used:

No.            No. Plots in        No. Replications
Varieties      one block (n')      for Each Variety (n)

   13               4                    4
   16               4                    5
   21               5                    5
   25               5                    6
   31               6                    6
   49               7                    8
   57               8                    8
   64               8                    9
   73               9                    9

Questions for Discussion

1. Why is an ordinary randomized block design inaccurate for comparisons of a large
number of varieties?

2. What principles are involved in incomplete block tests? What is a symmetrical
(or balanced) incomplete block?

3. Explain how to write out the sets for a completely orthogonalized 5 by 5 square.

4. What variations in field lay-out are permissible with a symmetrical incomplete
block test?

5. How does the computation for variety sums of squares differ from that for an
ordinary randomized block?

6.  What  is  the  efficiency  factor?  Compare  the  efficiency  of  a  symmetrical  incom- 
plete block  test  with  that  for  a  randomized  block  trial. 

7.  What  are  the  limitations  in  the  use  of  the  symmetrical  incomplete  block  design? 

8.  How  would  you  arrange  a  variety  test  so  as  to  be  able  to  fit  47  varieties  into 
a  symmetrical  incomplete  block  test? 

Problems 

1. It is desired to conduct a symmetrical incomplete block test for 16 varieties
of wheat. The form to be used will be m = n'^2. A 4 by 4 orthogonalized square
is given below. Write out the sets for the different blocks for each replicate.

111  234  342  423

222  143  431  314

333  412  124  241

444  321  213  132

2. Some uniformity trial data on wheat nursery plots were as follows in grams per
15 foot row (Data from Dr. G. A. Wiebe):


695  860  960  725  615 

735  910  Q75  775  680 

645  745  815  700  605 

630  $10  730  635  535 

680  745  840  730  645 

620  730  775  680  610 

620  745  660  565  520 

560  675  690  635  525 

625  706  725  6fc  645 

700  765  725  615  640 

685  78^  655  6c6  570 

625  556  •  590  590  605 

745  790  675  600  625 

680  .  670  630  64o  645 

655  730  615  650  640 

625  700  675  720  695 


Use the incomplete block sets written for problem 1 and apply the above yields
to them. Calculate the data for a symmetrical incomplete block design.


CHAPTER XXI

MECHANICAL PROCEDURE IN FIELD EXPERIMENTATION

I.  General  Considerations 

The experimental farm should be kept neat, clean, and in order at all times. Weeds
should be hoed from plots and alleys and all trash destroyed. Alleys and roadways
should be hoed or cultivated unless seeded to grass. Straight plot rows add to the
general attractiveness and in some cases to accuracy.

(a) Crop Rotation Scheme

Zavitz (1912) states that it is essential to have a rotation plan for the
entire experimental farm in order to maintain soil fertility. In addition, accurate
maps should be kept for the different fields so that a continuous record exists as
to the crops grown on each field for all past years. A rotation scheme prevents
mixtures in small grain nurseries as well as on other plots since volunteer grain may
contaminate seed plots where the same crop was on the land the previous year. On the
Colorado Station farm it has been found advisable to fallow some of the fields to
equalize the soil moisture due to the effect of irrigation and for weed control.
However, many experiment stations prefer that a bulk crop always precede nursery plots.
At the Nebraska station fallow has failed to equalize soil conditions.

(b) Preparation of Land for Experimental Crops

All plots for field trials should receive similar treatment except where the
treatment itself is under study. Cultural operations should be at right angles to
the direction of the plot rows so far as practicable. Thorne (1909) states that
fertilizers should be applied by machinery rather than by hand methods because of the
more uniform distribution. A two-way plow is useful in seedbed preparation as a
means for the elimination of dead-furrows and back-furrows in the middle of the
experimental area. Seeding machinery used in experimental work must be accurate and,
for that reason, should be calibrated wherever possible. Many machines are unfit for
such work. A drill that fails to drop seed uniformly may cause a serious error in
field plot yields. Moreover, it is very desirable to have a drill that can be cleaned
out readily. Plot rows should be made straight because crooked rows cause
irregularity in plot shape.

A  --  Methods  for  Planting  Experimental  Crops 

II. Seed Preparation

The best seed obtainable should be used in variety trials, i.e., pure as to variety,
free from weed seeds and foreign material, high in germination, and uniform in size.

(a)  Seed  Source 

Seed from entirely different sources may entirely upset the small differences
commonly found in yield trials. All seed used in such trials should have been grown,
harvested, and stored under uniform conditions for at least two years, according to
Engledow and Yule (1926). This is usually impossible. Under those conditions,
Parker (1931) advises "all that can be done is to see that the seed of the several
varieties is approximately of equal germination and is equally sound and healthy in
other ways." Adapted seed is highly desirable for self-fertilized crops and often
even more so in cross-fertilized crops like corn.


(b) Other Considerations

Unless disease reaction is under study, seeds of cereals should be treated
for control of fungus diseases such as smut. New Improved Ceresan, a dust treatment,
may be used at the rate of one-half ounce per bushel for the covered smuts. All
seed should be of the same age when possible. It should be weighed out for the
particular test on the same scales, especially when planted by weight per unit area.
The procedure on many stations is to measure out the seed for both nursery and field
plots. For rod-row or nursery trials the seed is placed in coin envelopes and
numbered to correspond to the plots. When a drill is used, a little more than enough
seed is desirable because the drill itself measures the seed planted.

III.  Rate  of  Seeding 

Considerable error may be introduced in some crops through variation in rate of
seeding.

(a)  Small  Grains 

In small grains the investigator must either plant equal weights or equal
numbers of seeds per unit area. Up until 1910, the "centgener" method was extensively
used in small-grain nurseries for the determination of yields. The kernels were
space-planted 10 inches apart each way in blocks that contained 100 seeds. Aside from
the theoretical objections in genetics, this was an absurd practice from the
standpoint of field yields because the seeds were planted approximately 14 times as far
apart as ordinarily occurs in a drill-planted field. In addition, a great amount of
detailed hand labor was required. The method has been discarded in this country in
favor of the rod-row. (1) Rod-Row Trials: The general procedure in rod rows is to
measure the seed per row. Kiesselbach (1923) summarizes the situation very well.
Fortunately, he states, there may be considerable variation in the rate of seeding
without material effect on the yield per acre. For instance, Turkey wheat planted at
3, 4, 5, 6, and 8 pecks per acre at Nebraska yielded 22.2, 24.6, 23.7, 24.4, and 24.5
bushels per acre, respectively, for 9 years. Seed of average size, or screened seed,
should be used for machine planters. Measurement of the seed gives results more
comparable with field conditions than where individual seeds are space-planted as in
the centgener method. Seed for hand-planting should be weighed. (2) The
"Checkerboard" Trial: The English workers use the "checkerboard" to some extent in their
variety trials. It is essentially a modified centgener plan in which the seeds are
spaced 2 x 6 inches apart. They admit it differs from field conditions and, for this
reason, use larger "observation" plots to supplement the checkerboard trials. The
checkerboard is precise but requires too much time and labor where many varieties are
under test.

(b)  Other  Crops 

Corn is generally planted by farmers in rows 3.0 to 3.5 feet apart. The usual
rate is three plants per hill for checked corn or single plants 14 inches apart
when drilled in the row. Under dryland conditions, the plants are usually spaced 20
to 30 inches apart in the row. This is the practice in experimental work
except that the seed is often planted at double the required rate and the plants are
later thinned to the desired stand. Without this precaution, Kiesselbach (1928) points
out, competition between adjacent rows that differ materially in stand may lead to
faulty results. In sugar beets the seed is generally planted very thick. The plants are
later thinned to the desired interval between plants, usually 12 inches. Sugar beets are
ordinarily planted in rows 20 inches apart.
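
The two corn spacings mentioned above can be compared with a little arithmetic. The following is a minimal sketch in Python, not part of the original manual; the function names are illustrative, and it assumes 43,560 square feet per acre, hills 3.5 feet apart each way with three plants per hill, and drilled plants 14 inches apart in 3.5-foot rows.

```python
SQ_FT_PER_ACRE = 43560.0  # assumed conversion, not stated in the text

def plants_per_acre_hills(hill_spacing_ft, plants_per_hill):
    """Plants per acre for check-planted corn with hills spaced equally both ways."""
    hills_per_acre = SQ_FT_PER_ACRE / (hill_spacing_ft ** 2)
    return hills_per_acre * plants_per_hill

def plants_per_acre_drilled(row_spacing_ft, plant_spacing_in):
    """Plants per acre for drilled corn with single plants spaced along the row."""
    area_per_plant_sq_ft = row_spacing_ft * (plant_spacing_in / 12.0)
    return SQ_FT_PER_ACRE / area_per_plant_sq_ft

checked = plants_per_acre_hills(3.5, 3)
drilled = plants_per_acre_drilled(3.5, 14)
print(round(checked), round(drilled))
```

Under these assumptions the two planting methods give the same stand per acre, which suggests why they are treated as equivalent rates.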

IV.  Methods  to  Plant  Field  Plots 

The ordinary grain drill is often used to plant field plots of small grain and forage
crops.



(a)  Calibration  of  Grain  Drills 

The necessity for drill calibration was shown by the work of Bonnett and
Burkart (1923). The drill may be jacked up, the seed rate set as desired, and the
wheels turned 30 revolutions at the rate they would turn over in the field. The
amount of grain collected for each drill should be weighed. A mark should be made
on the wheel to facilitate the count. It is only a matter of arithmetic to calculate
the rate at which the seed will be planted.
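
The arithmetic referred to might be sketched as follows. This Python example is an illustration only: the wheel diameter, drill width, and grain weight are hypothetical measurements not given in the text, and the calculation assumes that the wheel does not slip in the field.

```python
import math

SQ_FT_PER_ACRE = 43560.0  # assumed conversion, not stated in the text

def drill_rate_lb_per_acre(grain_lb, wheel_diameter_ft, drill_width_ft, revolutions=30):
    """Seeding rate implied by a stationary calibration run of the drill."""
    # Ground distance the drill would have covered in the field
    distance_ft = math.pi * wheel_diameter_ft * revolutions
    # Area that would have been seeded, in acres
    area_acres = distance_ft * drill_width_ft / SQ_FT_PER_ACRE
    return grain_lb / area_acres

# Hypothetical run: 1.5 lb of grain caught from a 7-foot drill with 4-foot wheels
rate = drill_rate_lb_per_acre(grain_lb=1.5, wheel_diameter_ft=4.0, drill_width_ft=7.0)
print(round(rate, 1), "lb per acre")
```

If the computed rate differs from the rate wanted, the drill setting is changed and the run repeated.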

(b) Use of the Drill

For  small  grain  and  forage  crops  the  different  replications  of  the  same 
variety  should  be  planted  before  the  seed  is  changed.  The  plots  may  be  staked  out  in 
advance  to  facilitate  this  procedure.  The  drill  should  be  thoroughly  cleaned  out 
between  varieties,  possibly  by  the  aid  of  an  air  bellows  to  dislodge  seed  in  the 
corners  of  the  drill  box.  Some  drills  are  made  over  so  that  the  seed  box  can  be 
tipped forward on hinges to empty. In some experiments where two kinds of seed are
planted  in  a  plot,  one  crop  may  be  drilled  in  one  direction  and  the  other  at  right 
angles  to  it,  e.g.,  nurse  crop  studies  in  alfalfa. 

V. Methods to Plant Small Grain Nursery Rows

Small grain nursery plots involve hand methods after the seedbed has been prepared.
Rod rows 12 inches apart are generally used. At some experiment stations 18-foot
rows are planted, being trimmed down at harvest time to 16 feet for wheat, 20 feet
for barley, and 15 feet for oats. This enables the investigator to convert the
yields in grams per plot into bushels per acre by the use of a simple factor. The
rod rows may be made by the use of a sled marker with the runners spaced at the
proper intervals, the ideal type being horse-drawn. The rows are then opened with a
wheel-hoe for hand planting. Another method is to use a sugar beet cultivator with
bull tongues spaced at the proper intervals. This has proved to be very satisfactory
at the Colorado station. A 12-inch furrow drill is used to mark out the rows on the
Akron Station. The seed, previously weighed out, is sometimes hand-planted (scattered)
in the row. A Columbia or Planet Jr. planter is used in many cases to plant wheat.
Modification of the Columbia drill for planting oats and barley has been suggested
by Woodward and Tingey (1933) as well as by Jodon (1932). A rapid method for
planting is by use of the spout drill. This is very satisfactory for genetic material
where yield is not a factor. The grain is poured through the spout, all seed in the
packet being planted in the row length. After a little experience the seed can be
planted very uniformly. One man pushes the drill while another drops the seed. The
spout drill may be used for space-planting after a little practice. One station that
uses 3-row plots for nursery studies has a horse-drawn planter. A convenient method
for space-planting small grains at definite intervals, for example two inches, is
to take a 6-inch board and bore holes at the proper intervals. The seeds are dibbled
in these holes.
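
The "simple factor" for converting grams per plot to bushels per acre can be worked out from the trimmed row lengths given above. The sketch below is in Python and is not part of the original manual. It assumes rows one foot apart, 43,560 square feet per acre, 453.592 grams per pound, and the standard U.S. bushel weights of 60, 48, and 32 pounds for wheat, barley, and oats; none of these constants is stated in the text.

```python
GRAMS_PER_LB = 453.592       # assumed conversion
SQ_FT_PER_ACRE = 43560.0     # assumed conversion
BUSHEL_LB = {"wheat": 60.0, "barley": 48.0, "oats": 32.0}      # assumed bushel weights
TRIMMED_ROW_FT = {"wheat": 16.0, "barley": 20.0, "oats": 15.0}  # row lengths from the text

def grams_to_bu_per_acre_factor(crop, row_spacing_ft=1.0):
    """Multiplier that converts grams per single-row plot into bushels per acre."""
    plot_area_sq_ft = TRIMMED_ROW_FT[crop] * row_spacing_ft
    plots_per_acre = SQ_FT_PER_ACRE / plot_area_sq_ft
    grams_per_bushel = BUSHEL_LB[crop] * GRAMS_PER_LB
    return plots_per_acre / grams_per_bushel

for crop in ("wheat", "barley", "oats"):
    print(crop, round(grams_to_bu_per_acre_factor(crop), 3))
```

Under these assumptions the factors come out very close to 0.1 for wheat and barley and 0.2 for oats, which appears to be the reason for the particular row lengths chosen.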

VI. Methods to Plant Row Crops

Corn will be taken as an example of a row crop. Generally a horse- or hand-drawn
marker is used to mark the distances between rows. When the corn is to be
check-planted in hills the plots are cross-marked to give a set of squares, the
intersections designating the hill locations. Hills are generally spaced 3.0 or 3.5 feet
apart in all directions. Suitable alleys should be left between blocks to facilitate
cultural and harvest operations. The stakes are distributed along one end of the
plots. The seed sacks or envelopes, with the variety number on them, are distributed
to correspond with the stakes. The numbers should be checked against the planting
plan to avoid mistakes. The seed sacks may be redistributed for each replicate.
Corn is generally planted with a hand planter in yield trials. One of the most
satisfactory planters is a made-over potato planter.* It is constructed to have a
long, full-length tin sleeve through which the proper number of kernels is dropped into
the shoe. A nail sack is convenient for carrying the seed. For planting six kernels
per hill, in order to thin to three plants later, it is convenient to plant three
kernels each in two jabs about one inch apart. This facilitates thinning.

B  —  Field  Observations  and  Care 

VII.  Value  of  Field  Observations 

Intimate knowledge of experimental plots is extremely desirable. In fact,
observations during the growing period of the crop may be as valuable as the yield data.
Differences due to disease, irregular loss of plants, etc., may account for the
variation in yield. Plot observations should be made at regular intervals. Notes should
be entered in the field book at once while clear and vivid in the mind. Word
descriptions should be clear and precise, being made as comparisons in terms of the check
when possible. Sometimes sketches, diagrams, or photographs are a better method of
expression than word descriptions.

VIII.  Measurement  of  Plant  Characters 

Field counts or measurements on certain plant characters given in numbers or
categories make excellent comparative field records. Formal "score cards" are apt to
make observations perfunctory. Hence, records should depend upon the particular crop
and the needs that may arise. Some of the more important characteristics usually
recorded are as follows: date emerged, stand, winter survival, date ripe, plant
height, lodging, barren stalks (in corn), disease infection, etc. Some of these may
be taken in quantitative measures while categories are required in other cases. When
actual counts are out of the question, a numerical scale may be employed to convey the
relative intensity of attack of a disease or insect pest. The numbers 1, 2, 3, and 4
may be used to represent, respectively, a slight, moderate, bad, or very severe
attack of rust, mildew, etc. A scale of 1 to 4 is generally adequate for categorical
data. Further subdivision merely leads to confusion. A very good rust scale is
available in the agronomic field book used by the Division of Cereal Crops and
Diseases, U.S.D.A. Yates (1934) reports a bias between different observers when a
large number of counts were made on wheat culms. The bias differed from observer to
observer and from sample to sample. The same individual should make all counts, or
at least all counts on a single replication, in order to avoid this form of systematic
error.

IX.  Stand  Counts  and  Estimates 

In certain crops stand counts are valuable, but this depends largely upon the
experiment. In forage experiments the counts are often made by the use of square yard or
meter quadrats. These may be permanent quadrats in perennial crop studies. In the
case of winter or spring survival counts in winter wheat, the stand percentage is
usually estimated except in special tests. One person should make all the estimates
because of the large personal error invariably introduced when more than one person
makes them. Estimation in categories such as good, fair, and poor stands may be
satisfactory. In plant-survival studies, as in winter wheat, a more precise method would
be to space-plant the seed in rod rows at 2-inch intervals. However, spaced plants
have been observed to kill worse than seeded material. Such tests are valueless for
yield.


*Note: The type used at the Nebraska, Minnesota, and Colorado stations is the "Acme
Segment"  potato  planter  manufactured  by  the  Potato  Implement  Co.,  Traverse  City, 
Mich.  It  can  be  slightly  modified  to  make  an  excellent  planter. 
X. Date Headed

There is considerable variation among workers as to the date when a crop should be
considered in head. In small grains, date in head is usually a more reliable index
of earliness or lateness than date ripe. This is particularly true under dryland
conditions where winds may prematurely dry up a variety rather than allow it to
ripen normally. In wheat, oats, and barley, some investigators take notes on first
heading, i.e., when 10 per cent of the heads are out of the boot. A plot is
considered fully headed out by some workers when 75 per cent of the plants in the plot are
in full head. Others use a standard as follows: (1) Oats, when the heads are half
out of the boot; (2) Barley, when the beards are out of the boot; and (3) Wheat,
when the heads show out of the boot. Date in silk and date in tassel are common
notes in corn, date of silking being regarded as a more reliable index of relative
maturity than date of tasseling. It is usual to determine the silking date and
convert the data to the number of days from planting to one-half silking. The plots
should be gone over at intervals of one or two days when date in head and similar
notes are taken because some dates may have to be moved up and others back.
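
The conversion of silking notes to days from planting can be illustrated with a short Python sketch. It is not part of the original manual: the dates and counts are hypothetical, and "one-half silking" is taken here to mean the first visit on which at least half of the plants in the plot show silks.

```python
from datetime import date

def days_to_half_silking(planting, silked_counts, plants_in_plot):
    """Days from planting to the first visit with at least half the plants in silk.

    silked_counts maps each visit date to the number of plants in silk on that date.
    Returns None when half silking was never reached during the visits recorded.
    """
    for visit, n_silked in sorted(silked_counts.items()):
        if n_silked / plants_in_plot >= 0.5:
            return (visit - planting).days
    return None

# Hypothetical plot of 60 plants visited every two days
notes = {
    date(1938, 7, 20): 12,
    date(1938, 7, 22): 31,
    date(1938, 7, 24): 52,
}
print(days_to_half_silking(date(1938, 5, 10), notes, plants_in_plot=60))
```

Because the plots are visited only every day or two, the result is accurate only to the interval between visits.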

XI. Per cent Lodged

Data on the differential lodging of small grains are desirable as a measure of
stiffness of straw. Sometimes after heavy rains or irrigations the soil may be loosened
so that the entire plant falls over. This is not true lodging. A plant has an
inherently weak straw when it bends or breaks over. It is often difficult to arrive
at inherent differences because of soil heterogeneity and its influences. A variety
should be considered lodged when the straw leans at an angle of 45 degrees or more
because, for practical purposes, grain lodged to such an extent is difficult to harvest.
The per cent of grain so lodged is usually estimated regardless of the cause.
However, straw weakness can be detected before the plants lean to an angle of 45 degrees.
Some investigators make notations as to whether the straw is apparently weak, medium,
or strong, and denote the condition categorically by W, M, or S. Under irrigated
conditions, small grains may be irrigated heavily after heading to induce lodging.
In corn, the relative resistance to lodging is often reported as the percentage of
plants erect at harvest. The percentages may be computed from counts of the numbers
of plants erect. In the interest of uniformity a plant should be considered erect
when it has not leaned more than 30 degrees from the vertical and does not
have the stalk broken below the ear. For those who wish to take more detailed records
on lodged plants, it is suggested that such plants be separated into those lodged
because of weak roots (leaning and down plants), and those lodged because of weak
culms (plants broken below the ear).
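
The rule for corn can be written out as a small classification. The Python sketch below is an illustration only, with hypothetical field notes; the category names "root-lodged" and "stalk-lodged" are labels chosen here for the weak-root and weak-culm groups described above.

```python
def classify_corn_plant(lean_degrees, broken_below_ear):
    """Classify one plant by the 30-degree rule and the broken-stalk rule."""
    if broken_below_ear:
        return "stalk-lodged"   # lodged because of weak culms
    if lean_degrees > 30:
        return "root-lodged"    # leaning or down plants, lodged because of weak roots
    return "erect"

def percent_erect(plants):
    """Percentage of plants erect; plants is a list of (lean_degrees, broken_below_ear)."""
    classes = [classify_corn_plant(lean, broken) for lean, broken in plants]
    return 100.0 * classes.count("erect") / len(classes)

# Hypothetical notes on four plants: (lean from vertical in degrees, stalk broken below ear)
plot = [(5, False), (40, False), (10, True), (30, False)]
print(percent_erect(plot))
```

A plant leaning exactly 30 degrees is counted as erect here, following the wording "not leaned more than 30 degrees."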

XII. Plant Height

Two men are required to take plant height notes readily, one to make the measurements
and the other to record the results. In the case of small grains such measurements
are generally made just before harvest. Sometimes one measurement is taken per plot
while, at other times, several plants are measured at random. One measurement per
plot is enough when the heights are uniform. A convenient rule is a 1 x 1-inch stick
marked at one-inch intervals to 60 inches. Height notes in corn are often taken in
the fall, but can be taken almost any time after the plants have tasselled out fully.
This can be accomplished with an ordinary rule about 12 feet in length, 2.5 inches wide,
and marked at 3-inch intervals.

XIII. Roguing Plots

Small  grain  plots  should  be  thoroughly  rogued  for  admixtures  before  harvest.  The 
plots  should  be  gone  over  several  times,  particularly  when  the  plants  begin  to  head 
or ripen. Rogues are most conspicuous at such times. It is difficult to rogue
barley out of oats because the oat plants are generally taller than barley plants.
Careful work is required to rogue off-varieties and off-types within a crop. These can
be detected most readily by observed differences in culm height, date of heading,
color of leaves, date ripe, and whether or not awns are present. It is a safe rule
to pull all plants that fail to conform to the majority of the plants in a plot.

XIV.  Date  Ripe 

The  date  on  which  a  crop  ripens  is  important,  particularly  in  small  grains  where 
earliness  is  often  a  desirable  feature.  Some  of  the  criteria  used  are  given  below. 

(a)  Wheat 

The grain may be considered ripe when it is hard in the morning. The straw
color is not always a reliable criterion of ripeness. Those who use straw color as
a criterion generally consider the grain ripe when the first nodes below the heads
on the main culms have turned brown.

(b)  Other  Crops 

In oats the plot is usually considered ripe after practically all of the
heads have turned yellow. The barley crop is generally considered ripe when all
green has disappeared from the heads. It is difficult to estimate date ripe on small
grain that is badly rusted or lodged as it tends to ripen unevenly and often
prematurely in the case of rust. In corn, date in silk is usually regarded as a more
reliable index of maturity than ripening data in the fall.

C -- Methods of Harvesting Experimental Crops

XV.  Difficulties  in  Harvesting 

The time of harvesting crops often presents difficulties. Parker (1931) mentions
that one might question the fairness when an early small grain variety is compared
with a check variety that may ripen 10 days or more later. As a rule, plots are
harvested as the varieties ripen, particularly where there are wide differences in
time of ripening. In some parts of the country, the investigator may be able to
wait until the latest varieties are ripe so that the entire field may be harvested
at once. Except for extreme differences in time of ripening, it is usually possible
to allow the early varieties to stand without particular damage to them. It may be
desirable in some instances to carry out two separate trials, grouping the early
varieties in one and the late ones in the other. In the case of root and tuber
crops, all varieties may be left in the ground and harvested at the same time without
serious consequences. The problem in corn is rather simple because all varieties are
left in the field after becoming ripe so as to dry out. In forage experiments,
inclement weather may interfere with the curing process and require that the hay be
turned several times. As a result, it may dry out unevenly or the leaves may shatter.
A possible error in weight might result.

XVI.  Methods  of  Harvesting  Field  Plots 

The  use  of  farm  machinery  is  often  anticipated  for  large  field  plots. 

(a)  Small  Grain  Plots 

Small grain field plots should be gone over carefully before harvest to be
certain that there are no errors due to defective drilling, rodent, or other injury
that might influence the yields. When small grains are badly lodged, it may be
necessary to separate the varieties along the margins and push them over into their
respective plots before harvest. Kiesselbach (1928) uses a binder equipped with an
engine that operates the working parts. At the end of the plot, the horses are
stopped but the engine continues to operate and clean out the binder. In the
absence of the engine it is necessary to crank the platform and elevator canvases by
hand. The small grain shocks should be placed well within the plot to prevent chance
mixtures with adjacent plots should wind scatter some of the bundles. At some
stations the bundles are shocked on alternate ends of the plots. The shocks may be
tied with binder twine to minimize the risk. When birds are numerous, shock covers
should be provided. They can be made by sewing together ordinary burlap feed bags.

(b)  Corn  Yield  Tests 

The entire plot can be harvested without appreciable error when the plant
stand is 90 per cent or better. Otherwise, it is advisable to reject at harvest all
hills with less than the normal stand, and calculate yields on a perfect-stand basis.
Usually the imperfect-stand hills are cut with a corn knife and removed from the plot.
A record is then made of the number of perfect-stand hills that remain. Sometimes
counts on barren stalks, 2-eared stalks, suckers, smutted plants, and lodged plants
are made at this time. For small yield trials, actual harvesting can be done
conveniently with an apple-picking bag. For large field plots, Kiesselbach (1928) uses
a wagon with a flat rack with partitions built on it. A partition may be placed
lengthwise through the center and each side divided, for instance, into three compartments
where three center rows are harvested for yield (as in 5-row plots with border rows
discarded). This allows a separate compartment for each row. Three men can husk,
one man being on each yield row. The compartments on the other side can be used for
the next plot on the return. At the end of the field, the corn from each plot is
sacked and tagged. Field weights of ear corn are sometimes taken. The corn sacks
may then be either piled in small piles in a shed until air-dry, or they may be tied
up on wires in a drying shed (Colorado method). The latter seems to allow the corn to
dry out more evenly and more quickly. Some stations now have elaborate drying
equipment where the entire plot yield can be dried to a moisture-free basis in a
relatively short time.
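
The stand rule and the perfect-stand calculation described at the beginning of this section can be sketched in Python as follows. This is an illustration under stated assumptions rather than a procedure given in the manual: the plot figures are hypothetical, hills are taken to be spaced equally both ways, and the yield is left in pounds of ear corn per acre.

```python
SQ_FT_PER_ACRE = 43560.0  # assumed conversion, not stated in the text

def harvest_whole_plot(plants_present, plants_expected):
    """True when the stand is 90 per cent or better, so no hills need be rejected."""
    return plants_present / plants_expected >= 0.90

def perfect_stand_yield_lb_per_acre(ear_weight_lb, perfect_hills, hill_spacing_ft):
    """Scale the weight from the remaining perfect-stand hills to an acre of such hills."""
    weight_per_hill = ear_weight_lb / perfect_hills
    hills_per_acre = SQ_FT_PER_ACRE / hill_spacing_ft ** 2
    return weight_per_hill * hills_per_acre

# Hypothetical plot: 48 hills 3.5 feet apart, 3 plants per hill intended, 120 plants found
print(harvest_whole_plot(plants_present=120, plants_expected=144))
# 40 perfect-stand hills remained and gave 60 lb of ear corn
print(round(perfect_stand_yield_lb_per_acre(60.0, 40, 3.5), 1))
```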

(c) Forage Experiments

Forage plots for hay are almost always cut with a mower when 1/40-acre in
size or larger. The plots may be trimmed evenly on the ends before the regular
cutting time. The material is then raked and removed. Borders between plots are
generally disregarded for large field plots. It is an advantage to be able to start
on one side of the field and mow through all the series, thus lessening the number of
turns. A man should follow the mower with a fork to be sure that hay is not carried
through the alley from one plot to the next. After the hay has been dried
sufficiently, a side-delivery rake may be used to put it in windrows, after which it may be
bunched by a dump-rake or by hand. A convenient method to handle the hay from each
plot is to put it on a wagon or truck on which a sling has been placed. The load is
then weighed, the net weight determined, and the hay unloaded from the truck by a
cable stacker. A small composite sample may be taken to dry to an air-dry basis, or
it may be ground for an immediate moisture determination. For plots 1/40-acre in
size or smaller, hay may be weighed conveniently by a portable platform scale on
which a rack is set. For plots away from the central experiment station, a tripod
and spring balance afford a good method to weigh forage plots. A large piece of
canvas is equipped with snaps so that, when the hay is put on it, the sides can be
gathered in and snapped to a ring. It is then readily hung on the scale.
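
The weighing steps above lead to a yield figure by simple arithmetic. The Python sketch below is an illustration with hypothetical weights; it assumes a 1/40-acre plot, a ton of 2,000 pounds, and that the composite sample loses moisture in the same proportion as the whole load.

```python
def hay_yield_tons_per_acre(gross_lb, wagon_lb, sample_green_lb, sample_air_dry_lb,
                            plot_acres=1.0 / 40.0):
    """Air-dry hay yield in tons per acre from one weighed load and a dried sample."""
    net_green_lb = gross_lb - wagon_lb                  # net weight of hay as hauled
    dry_fraction = sample_air_dry_lb / sample_green_lb  # share of weight left when air-dry
    air_dry_lb = net_green_lb * dry_fraction
    return air_dry_lb / plot_acres / 2000.0

# Hypothetical load: wagon and hay 1650 lb, wagon alone 1500 lb;
# a 2.0-lb composite sample weighed 1.5 lb when air-dry
tons = hay_yield_tons_per_acre(1650.0, 1500.0, 2.0, 1.5)
print(round(tons, 2))
```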

(d)  Sugar  Beet  Trials 

In sugar beet yield trials, 4-row plots are generally used with the two
center rows harvested for yield. Except in studies on stand, and certain other
instances mentioned previously, the plots are commonly harvested on the basis of
competitive beets, i.e., plants surrounded by plants on all sides. The tops of the
other beets (non-competitives) may be chopped off with a hoe before harvest. The
roots are then pulled with a standard beet puller. It is common practice to pull
one replication at a time. The roots without tops are usually weighed in order to
have this component for total plot yield in case this seems to be needed later. The
non-competitives are then discarded. Two 20-root samples may be taken from the
non-competitive beets as a sugar sample. The competitive beets are then pulled, topped,
and weighed for each plot. The tare is then subtracted from the field weight of the
roots. When a washer is not available, the tare may be taken in the field. The
sample for tare is first weighed, the roots cleaned with steel brushes, and
reweighed. The difference in weight is the tare. It is believed desirable to
calculate the tare for each plot separately.
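
The tare correction can be illustrated with a short Python sketch. The weights are hypothetical, and the sketch assumes that the tare found on the sample applies in the same proportion to the whole plot, which the text does not state explicitly.

```python
def tare_fraction(sample_dirty_lb, sample_clean_lb):
    """Share of the sample weight that was dirt, from weights before and after brushing."""
    return (sample_dirty_lb - sample_clean_lb) / sample_dirty_lb

def clean_root_weight(field_weight_lb, sample_dirty_lb, sample_clean_lb):
    """Plot field weight less tare, applying the sample's tare fraction to the whole plot."""
    return field_weight_lb * (1.0 - tare_fraction(sample_dirty_lb, sample_clean_lb))

# Hypothetical plot: 180 lb field weight; tare sample 25.0 lb dirty, 23.5 lb after cleaning
print(round(tare_fraction(25.0, 23.5) * 100, 1), "per cent tare")
print(round(clean_root_weight(180.0, 25.0, 23.5), 1), "lb clean roots")
```

Computing the fraction separately for each plot follows the recommendation that the tare be calculated plot by plot.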

XVII.  Harvesting  Small  Grain  Nursery  Plots 

Competent and continual supervision is necessary in the small grain nursery at
harvest time. Some investigators clean-cultivate the alleys between series. Under such
conditions the rod rows are generally trimmed down to remove border effect. In
wheat, for instance, the crop is planted in 18-foot rows, one foot being trimmed
from each end of the plot. A string may be stretched across the series at both ends
to designate the discard area to be cut, or a 16-foot bamboo pole may be used on
each center row (in 3-row plots) so that the wheat may be cut at both ends of the
pole. Other investigators plant the alleys to some readily distinguishable variety,
thus eliminating the border effect on the ends of the rod rows. The alleys are then
removed before harvest. Hand sickles are used to cut nursery plots. The
smooth-edged sickle is most widely used, but a saw-toothed sickle is satisfactory when new.
Where straw yields are taken, grass shears may be used to assure an even cut. Kemp
(1935) has constructed a rod-row harvester of the rotary shear type with which 2 men
may cut 1500 rows per day. The harvested bundles are tied with binder twine,
usually in one place. Strings should be tied with a simple, secure knot. The plot
stake may be tied into the bundle or a tag attached to the string with the plot
number on it. The bundles may be tied on a table. Men who tie bundles should tape
their fingers. Seed plots are often sacked with large paper sacks tied over the
heads to prevent mixtures. By the aid of a large funnel, 25-pound manila bags are
easily placed over the heads. Sacked bundles should be put under cover as soon as
possible to protect them from rain. Small grain bundles may be either shocked in
the field until they are ready to thresh, or hauled to a shed and hung up to dry.
A drying shed may have wires about four feet apart stretched from one end to the
other at sufficient height so that the bundles can be tied to the wire with heads
down. The bundles should be hung fairly wide apart when they are harvested a little
green. This is particularly true for oats.

XVIII. Harvest of Corn Breeding Material

Inbred  and  hybrid  strains  of  corn,  which  are  the  result  of  hand  pollination,  are 
usually  harvested  after  maturity.  Individual  ears  may  be  collected  in  the  bag  over 
the  ear  shoot,  and  all  sacks  from  the  same  row  tied  together  with  binder  twine,  the 
tying  being  done  with  an  ordinary  sack  needle.  These  sacks  are  then  hung  to  wires 
in the drying shed and allowed to remain there until air-dry. This method has proved
very  satisfactory  at  the  Colorado  station. 

D  —  Threshing  and  Storage 

XIX.  Methods  of  Threshing  Field  and  Increase  Plots 

Small  grain  field  and  increase  plots  are  commonly  threshed  with  the  standard  grain 
separator. Kiesselbach (1928) has found it necessary to make minor modifications to
adapt it for this purpose. He lists these changes as follows: (1) removal of the
grain elevator; (2) elimination of the self-feeder; (3) providing a hinged door at
the foot of the tailings elevator for cleaning out between plots; (4) replacing the
grain auger with a shaker-trough device; (5) removal of the grain-saving auger in
the blower, where one exists; (6) equipment with a high-pressure air pump and tank
to supply air pressure through a hose to dislodge grains when the machine is cleaned
out between plots; (7) cutting several holes, with covers, into the sides of the
separator at convenient places to observe the interior and to introduce air pressure
to clean out the separator. Such modifications make it easier to clean out the
machine between plots, thus reducing the chances for mixtures. The chances for
mixture may be reduced further by threshing all plots of the same variety in succession.
Seed  can  be  saved  from  the  last  plot  of  the  variety  to  be  threshed.  It  is  important 
to  operate  the  machine  uniformly  throughout  each  experiment.  The  grain  per  plot  is 
often  weighed  on  a  platform  scale  at  the  separator. 

XX. Threshing Nursery Plots

Small grains in yield trials are generally threshed with small nursery threshers,
while  genetic  material  is  usually  threshed  by  hand. 

(a) Nursery Threshers

Several  machines  that  can  be  cleaned  readily  have  been  devised  to  thresh 
small nursery plots. According to Hayes and Garber (1927) "the chief requisites of
a machine to be used for experimental purposes are as follows: It should be easily
cleanable and, in so far as possible, there should be no ledges or ridges upon which
seeds may lodge. The alternate threshing of different nursery crops is a desirable
procedure. Each of the plots of one strain of wheat may be threshed separately in
rotation and then a strain of oats may be threshed in the same way. At the Minnesota
Experiment Station winter wheat is threshed alternately with barley, and spring wheat
with oats. This plan helps materially to reduce the roguing of accidental mixtures
from the plots." The Cornell machine designed by H. W. Teeter is very satisfactory
for multiple-row plots, while the Kansas machine is widely used for rod rows. The
Cornell machine has a shaker, screen, and fan. Its most serious drawback is the
difficulty in cleaning it between varieties; however, it can be cleaned more readily
than the Kansas machine. Recently, Vogel and Johnson (1934) have developed a new
type of rod-row thresher which is a combination of an overshot cylinder and modified
screenless shaker and fan of an ordinary fanning mill. The grain is further cleaned
by a separate re-cleaner. It has been found satisfactory for small grains, peas,
flax, and some grasses. Grain weights are taken after threshing, usually in grams
for rod-row plots.

(b)  Hand  Threshing 

In genetic material where it is desired to thresh single plants, threshing
is usually done by hand. A threshing board three feet square is useful for this purpose.
The frame can be made of 1 x 2-inch material over which a canvas is stretched
tightly. Two blocks, about 4 x 6 inches in size, are then made and covered on both
sides with corrugated rubber. These work very well for threshing wheat and other
naked grains. For barley, it has been found at the Colorado station that the heads
thresh out better when rolled up in a small canvas cloth (about 9 inches square) and
rubbed. A piece of tin bent to form a fan can be used to blow the chaff out of the
grain by striking it on the canvas. Coffman (1935) was able to thresh 100 to V)0
single oat panicles per hour by the use of a lightweight close-fitting leather glove
on the right hand. The spikelets are stripped into a grain pan where the chaff is
easily blown out.

XXI. Methods for Shelling Corn

After corn has reached an air-dry condition it is ready to shell for final determinations.
Genetic material is usually shelled by hand, altho some workers use an
enclosed single ear sheller. An ordinary corn sheller is very satisfactory for
yield trials. It should be enclosed so that the kernels are not scattered when the
ears are shelled. The air-dry weight of ear corn should first be taken for the corn
from each plot. A platform scale is often used for such weights. It should be
balanced frequently to keep it in adjustment. The corn is then shelled, the cobs
being looked over minutely to be sure that all kernels have been recovered. The
shelled corn is then weighed and recorded. A 500-gram shrinkage sample is taken at
the Nebraska station and oven-dried to a constant weight. The yield of moisture-free
corn is calculated from the percentage of oven-dry corn in the shrinkage sample. At
the Colorado station, bushel weight is taken with the standard bushel weight tester,
since bushel weight has been found to be an index of maturity. Moisture determinations
are made with the Tag-Heppenstall moisture meter, one sample per plot yield.
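
The shrinkage calculation described for the Nebraska station can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a station procedure: the function name and the example weights (a 20-lb. plot yield and a 500-gram sample that dries to 430 grams) are hypothetical.

```python
def moisture_free_weight(shelled_weight, sample_fresh_g, sample_oven_dry_g):
    """Estimate the moisture-free weight of shelled corn for one plot.

    shelled_weight    -- weight of shelled corn from the plot (any unit)
    sample_fresh_g    -- weight of the shrinkage sample when drawn, in grams
    sample_oven_dry_g -- weight of the same sample after oven-drying, in grams
    """
    if sample_fresh_g <= 0 or not 0 <= sample_oven_dry_g <= sample_fresh_g:
        raise ValueError("need fresh weight > 0 and 0 <= oven-dry <= fresh")
    # Proportion of oven-dry corn in the shrinkage sample.
    dry_fraction = sample_oven_dry_g / sample_fresh_g
    # The plot yield is reduced in the same proportion.
    return shelled_weight * dry_fraction


# Hypothetical plot: 20 lbs. of shelled corn, 500-gram sample drying to 430 grams.
plot_dry_lbs = moisture_free_weight(20.0, 500.0, 430.0)
print(f"Moisture-free shelled corn: {plot_dry_lbs:.2f} lbs.")
```

The result is in the same unit as the plot weight supplied, so the function serves equally for pounds or grams.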

XXII.  Storage  of  Seed  of  Experimental  Crops 

There are probably as many methods for seed storage of experimental crops as there are
experiment stations. The first requisite is a place safe from mice and insects.
Cabinets with metal drawers probably afford the best storage. It is usually necessary
to fumigate once or twice per year where grain weevils and other insects are
troublesome. For small seed lots a crystalline compound known as "Antimot" will
effectively control insects. Small grain seed is usually kept in cloth bags, especially
seed saved from rod-row tests. Genetic material is commonly stored in coin
envelopes. Seed corn for variety or yield tests may be stored in large bins. Genetic
and breeding material may be kept either in cloth bags or in envelopes. At the
Nebraska station inbred and hybrid seed corn supplies are kept in large clip envelopes
(6 x 9 inches in size). These are filed in drawers in serial order. A similar plan
is followed at Minnesota.

Questions for Discussion

1.  What  precautions  are  necessary  in  a  crop  rotation  scheme  for  experimental  crops? 

2.  How should the seedbed be prepared for experimental crops?

3.  Under  what  conditions  should  experimental  seeds  be  treated  for  disease? 

4.  What is the centgener method?  Checkerboard method?  Rod-row method?

5.  How  is  corn  generally  planted  for  experimental  purposes?  Sugar  beets? 

6.  How would you calibrate a drill?

7.  Explain how you would lay out, mark, and plant a wheat nursery.  Give all dimensions
and processes.

8.  Why  are  field  observations  important?  What  plant  measurements  and  notes  are 
generally  taken  on  small  grains? 

9.  What different methods can be used for making stand counts?

10.  At what time would you consider wheat, oats, and barley in head?  Ripe?

11.  How  would  you  take  lodging  notes  in  small  grains?  Corn? 

12.  What precautions or advice should be given to your assistants when roguing plots?

13.  How  would  you  harvest  small  grains  in  a  test  where  the  varieties  differed  widely 
in  date  of  ripening?  Why? 

14.  How are large field plots of small grains generally harvested?  Corn yield tests?

15.  Give the detailed steps for harvesting sugar beet plots for yield.

16.  Describe a method for harvesting forage yield tests.

17.  Explain in detail how you would harvest small grain nursery plots.

18.  What  modifications  on  an  ordinary  grain  separator  are  necessary  to  adapt  it  for 
threshing  field  plots  to  prevent  mixtures? 

19.  What  are  the  requisites  for  a  small  grain  nursery  thresher? 

20.  How would you hand-thresh barley heads?  Wheat heads?  Oat panicles?

Problems 

1.  It is desired to plant wheat in rod row trials at the rate of 90 lbs. per acre,
the rate used by farmers in the vicinity.  The nursery rows are 18 feet long and
12 inches apart.  Calculate the amount of seed to weigh out in grams for each row.

2.  Suppose the yield from a 16-foot rod row of wheat is 2op grams.  Calculate the
yield per acre.

3.  The weight of shelled corn harvested is 23 lbs. on a plot 20 hills long.  (a)
When the hills are 36 x 36 inches, calculate the yield per acre for air-dry
shelled corn.  (b) Calculate the yields per acre on the basis of corn with 13-J!
per cent moisture when the original shelled corn contained 13-2 per cent moisture.

4.  Make up a table of factors for the conversion of pounds shelled corn per plot to
bushels of shelled corn per acre when 10 to 20 hills are harvested.  Suppose the
hills to be spaced 36 x 36 inches.
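
Conversions of this kind all rest on the ratio of the plot area to the 43,560 square feet in an acre. The Python sketch below illustrates the arithmetic with hypothetical figures rather than those of the problems. The function names are illustrative only, and the bushel weights (60 lbs. for wheat, 56 lbs. for shelled corn) are the customary standards, assumed here rather than taken from the text.

```python
SQ_FT_PER_ACRE = 43560.0
GRAMS_PER_LB = 453.592


def row_acres(row_length_ft, row_spacing_in):
    """Area occupied by one nursery row, expressed in acres."""
    return row_length_ft * (row_spacing_in / 12.0) / SQ_FT_PER_ACRE


def seed_grams_per_row(rate_lbs_per_acre, row_length_ft, row_spacing_in):
    """Grams of seed for one row at a given field seeding rate."""
    return rate_lbs_per_acre * GRAMS_PER_LB * row_acres(row_length_ft, row_spacing_in)


def bushels_per_acre_from_row(yield_grams, row_length_ft, row_spacing_in, lbs_per_bushel):
    """Convert the grain weight of one row (grams) to bushels per acre."""
    pounds = yield_grams / GRAMS_PER_LB
    return pounds / lbs_per_bushel / row_acres(row_length_ft, row_spacing_in)


def corn_plot_factor(hills, hill_spacing_in, lbs_per_bushel=56.0):
    """Multiplier turning lbs. of shelled corn per plot into bushels per acre."""
    plot_sq_ft = hills * (hill_spacing_in / 12.0) ** 2
    return SQ_FT_PER_ACRE / plot_sq_ft / lbs_per_bushel


# Hypothetical example: 60 lbs. per acre sown in 16-foot rows spaced 12 inches.
print(f"Seed per row: {seed_grams_per_row(60.0, 16.0, 12.0):.1f} grams")
# Hypothetical example: a 150-gram row yield of wheat at 60 lbs. per bushel.
print(f"Yield: {bushels_per_acre_from_row(150.0, 16.0, 12.0, 60.0):.1f} bu. per acre")
# Hypothetical example: factors for plots of 12 to 14 hills spaced 42 x 42 inches.
for hills in range(12, 15):
    print(hills, round(corn_plot_factor(hills, 42.0), 3))
```

Keeping the area calculation in one function means a change of row length or spacing is made in a single place.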
Table 1. - Area Under the Normal Curve 1/


t 

A 

t 

A 

t 

A 

t 

A 

.00 

.50000 

.ko 

.65542 . 
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.01 

.50399 

.41 
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.81 
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1.21 
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.02 

.50798 

M 

.66276 

.82 

.79389 

1.22 

.88877 

.03 

.51197 

M 

.666^0 

.83 

.79673 

1.23 
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.51595+ 

.44 

.67003 

.84 
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1.24 

.89251 

•  05 
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A5 
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1.25 
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.52392 

.k6 
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.80511 

1.26 
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M 
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.87 

.80785+ 
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.81057 

1.28 
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.09 

.53586 

M 

.68793 

.39 

.31327 
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1.30 
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1.31 
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1.32 
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.53 
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.91149 
Table 1. - Area Under the Normal Curve 1/ (Cont.)
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,99653 
99664 
,  99674 
.99683 
.99693 


.  99702 

,99711 
.99720 
,99728 
,90737 


t 

A 

2.30 

.997H5- 

2.81 

•99732 

2.82 

.9976O 

2.83 

.99767 

2.34 

„99774 

2.35 

.90731 

2.86 

.99738 

2.87 

%q07Cv",. 

.99801 

d  .  09 

.99307 

2.00 

.99813 

2.91 

.90619 

2  .  92 

.99825+ 

2.93 

.  99o3,„ 

2.94 

.99830 

2.95 

.99841 

2  .06 

.99846 

O  Qf7 

,99851 

2.93 

.99856 

2.99 

.S700.L 

3.00 

. 098G04 

3.01 

. 99869 

3,02 

.09674 

3.03 

.93873 

3.04 

■.99882 

3 .  05 

.90886 

7>.o6 

. 99889 

3.07 

.99893 

3.03 

.99897 

3.09 

.99900 

3-10 

.99903 

3.ll 

.99907 

3.12 

.99910 

3.13 

.99913 

3, 14 

.99916 

3.13 

. '/99  lo 

3.16 

.99921 

3.17 

on.' .0)1 

5.13 

.99O26 

3.10 

>53 


Table 1. - Area Under the Normal Curve 1/ (Cont.)
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A 


A 
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3.20 

.99951 

3-40 

.99966 

3.60 

.99934 

3.3o 

.99993 

3.21 

.99931+ 

3JM 

.99967 

3.61 
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3.81 
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3.22 
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3.42 
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3.62 
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3.32 
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.99938 
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3.83 
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3.24 
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3.44 
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3.64 
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3.84' 
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5.25 
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3.35 
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3.36 
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.99952 

3.-50 
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3.77 
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3.38 
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•99984 

3.79 
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3-99 
4.00
4.50

1/ Table 1 was taken from "Tables" by L. R. Salvosa, published in "Annals of Mathematical
Statistics", May 1930.
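
Where an entry of Table 1 is illegible or missing, the area can be computed directly. The Python sketch below assumes that A is the area under the normal curve from minus infinity up to the deviate t, which agrees with the tabled entry of .50000 at t = .00. The function name is illustrative; `math.erf` is the error function of the Python standard library.

```python
import math


def normal_curve_area(t):
    """Area under the standard normal curve from minus infinity up to t."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


# Print a few areas to five decimals, the precision used in Table 1.
for t in (0.00, 0.40, 1.22, 3.00):
    print(f"t = {t:4.2f}   A = {normal_curve_area(t):.5f}")
```

The area between -t and +t, often wanted in tests of significance, is 2A - 1.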
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14  16  18  21 

4.4 
4.6 

1.4816 
1.5041 
I.52&I 

4839  ^861  4884 
5063  5085  5107 
5282  5304  5326 

4907  4929  4951 
5129  5151  5173 
5347  5369  5390 

i).'974  4996  5019 
5195  5217  5239 
5412  5433  5^54 

2  57  9  11 
2  4  7  9  11 
2  4  6  9  11 

14  16  18  20 
13  15  18*20 

13  15  17  19 
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4.8 
5.0 


5.1 

5-2 
5. 3 


5.^ 
5-5 
5.6 


5-7 
5.8 

5-9 


6.0 


6,1 

6.2 

6.5 


6.14 

6.5 
6.6 


6.7 
6.8 

6.9 


o 


l . 5476 

1.5686 
l . 5892 


.  609)+ 


1.6292 
1.6487 
I.6677 


1.6864 
1.70^7 
1.7228 


1.7405 
1.7579 
1.7750 


1.7918 


I.8085 
1.8245 
1.8405 


1 


Table IV.  Neperian or Hyperbolic Logarithms 1/ (Cont.)

_ 


3 


5497  5518  5559 
5707  5728  5748 
5915  5935  5953 


6ll4  6134  6154 


6312  6332  6351 
6506  6525  6544 
6696  6715  6734 


6882  6901  6919 
7066  7084  7102 
7246  7263  7281 


7422  744o  7457 
7596  7613  7630 
7766  7785  7800 


7934  7951  7967 


8099 
8262 
8421 


8116  8132 
8278  8294 
3437  8433 


1.8565  8579 

I.8718 I 87o3 

1.8871  8886 


S594  8610 
8749  876^: 
8901  8916 


1.9021 1 9036 
I.9169  9184 

1.9315 '9356 


7.0 


7.1 
7.2 

7.3 


7.4 
7-5 
7-6 


7-7 
7.8 

7.9 


3.0 


8.1 
8.2 
8.3 


1     n) 


9459 


1.9601 
1.9741 
1.9879 


9051  9066 
9199  9213 
9^44  9339 


9473  9488  950. 


9615 

9755 
98Q2 


9629  96!-:  3 
9769  9782 
9906  9920 


2.0015 
2.0149 
2.0281 


2.0412 
2.0541 
2.0669 


2.0794 


2.0919 
2.1041 

2.  ].l63 


0028 
0162 
0295 


0042  0055 
OI76  OI89 
0308  0321 


0425 
0554 
0681 


0438  0451 
0567  0580 
0694  0707 


4 


5560  5581  5602 
5769  5790  5810 

5974  5994  5oi4 


6174  6194  6214 

6371  6390  6409 
6563  6582  6601 
6752  6771  6790 

6938  6956  6974 


7120 


7133 


7299  7317 


7156 
7331)- 


7475  7492  7509 
7647  7664  7681 
7817  7834  7831 

7984  3001  8017 


8148  8165 
8310  8326 
8469  8485 


8131 
8342 
8500 


8625  8641 
8779  8795 
8931  3946 


8656 
8310 
8961 


9081  9095 
9228  9242 
9373  9387 


9110 

9237 
q4o2 


9516  9530  9544 


9657  9671 
9796  9810 

9933  99^7 


9685 
9824 
9961 


0069  0082 
0202  0215 
0334  0347 


0096 
0229 


0807  0819  0832 


0931 
1054 

1175 


0943  0956 
1066  1078 
1187  1199 


0464  0477 
0592  0605 
0719  0732 


C-490 
0618 
0744 


0844  0857  0869 


0968  0980 
1090  1102 
1211  1223 


0992 
1114 
1235 


7 


8   9 


5623  5644  5665 
5831  5851  5872 
6034  6054  6074 

6233  6253  6273 


6429  6448  6467 
6620  6639  6658 
6808  6827  6845 

6993  7011  7029 
7174  7192  7210 
7352  7370  7337 


12  3 


2  4  6 
2  4  6 
I     4  6 


4  6 


2  4  6 

2  4  6 

2  4  6 

2  4 

2  4 

2  4 


7527  7544  7561  I  2  3 
7699  7716  7733  2  3 
7867  7884  7901  2  3 


3034  8050  8066 


8197  3213 

3358 

8516 


3374 
8532 


8229  !  2 
8390  J  2 
8347  2 


1 


3__5_ 

3  5 
3  5 
3  5 


3  11  13 
3  10  12 
8  10  12 

8  10  12 


8  10  12 
8  10  11 

7  9  11 


7  9  11 

7  9  11 

7  9  11 

7  9  10 

7  9  10 

7  3  10 

7  8  10 


7 


A 


9 


15  17  19 

14  16  19 

14  16  18 

14  16  18 


14  16  1.3 
13  15  17 
13  15  17 


13  15  16 
13  14  16 
12  14  16 

12  14  16 
12  14  13 
12  13  15 


12  13  15 


6  3  10  ill  13  15 
6  8  10  |li  13  14 
6  8  9  11  13  14 


8672 
3825 
8976 

9125 
9272 
9416 


8687 
8840 
8991 

9l4o 
9286 
9430 


9539  9373 


8703  I  2 
8856  I  2 
9006  J  2 
i ™ 

9135  I  1 
9301  J  1 
9449  j  1 

9587  i  1 


i  b 
6 

!  6 


r 


3  k 
3 .4 


9699 
9838 
9974 


0109 

0242 


9713 
985I 

9988 


9727 
9865 
0001 


3  4 
-a     h 


o 
i  6 


!6 


8 
3 
3 


11  12  14 
11  12  14 
11  12  14 


9  10  12  13 
9  10  12  13 
9  10  12  13 


9 1 10  11  13 


7  8 
7  8 
7  8 


10  11  13 


10  11 
10  11 


12 

12 


012.2 
0255 


0136  !  1 
0268  !  1 


4  I.5 


>373  0386  0399  \  l 


0503 
0631 
0757 


0516 
0643 
0769 


0528  i  1 
0656  I  1 
0782  I  1 


08S2  0894  0906  1  3 


'7     8 
7     8 


l5  6  8 
5  6  3 
568 


5 


o 


8 


1005  1017  1029 
112b  1133  1150  . 
1247  1258  1270  j  1 


2  4  15  6  7 
2  4  j  5  6  7 
2     4  !  3     6     7 


9  11  12 
9  11  12 
9  10  12 


9  10  12 
9  10  12 
9  10  11 

9  10  11 


9  10  11 
9  10  11 
8  10  11

Table IV.  Neperian or Hyperbolic Logarithms 1/ (Cont.)

0 

12         3 

1+5         6 

7        8        9 

12     3 

1+ 

5     6    7 

8    9 

8.1+ 

8.5 
S.6 

2.1282 
2.11+01 
2.1518 

129"+  1306  1313 
11+12  ll+2l!-  l'!-3o 
1529  15^1  1552 

1330  13^2  1353 I1365  1377  1369 
11+1+8  11+59  1^71  j  11+33  ll+9'+  1506 
1561+  1576  1587  3.599  1610  1622 

1  2  k 
12  4 
12     3 

6     7 
6     7 
6     7 

8 

6 

O 
O 

10  11 
9  11 
9  10 

8.7 
3.8 

3.9 

2.1633 
2.171+8 
2.1861 

161+5  I656  1668 
1759  1770  1782 
1872  1833  189!+ 

1679  1691  1702 
1793  1801+  1815 
1905  1917  1923 

1713  1725  1736 
1827  1838  I3lf9 
1939  1950  1961. 

1  2  3 
12     3 

1     2    .3 

5 
5 

k 

6     7 
6    7 
6     7 

8 
8 
8 

9  10 
9  10 
0  10 

9.0 

2.1972 

I983  199^  2006 

2017  2028  2059 1 2050  2061  2072. 

1     2     3 

1+ 

6     7 

p. 

9  10 

Q.l 
9.2 
9.3 

2.2083 
2  .-2192 
2.2300 

209I+  2105  2116 
2203  2211+  2225 
2311  2322  2332 

t 

2127  2138  23J+3J2159  2170  2181 
2235  221+6  225712268  2279  22o9 
23l(3  235I+  236412375  2386  2396 

12  3 
1.2*3 

12     3 

1+ 
1+ 
1+ 

i 

5     7  1  8 
5     6  j  8 

5     6  |  7 

9  10 
9  in 
9  10 

9.* 

9-5 
9.6 

2. 21+07 
2.2513 
2.2618 

21+18  21+28  2I+39 
2523  253I+  25V+ 
2628  2638  261+9 

21+50  21+60  2 1+71  2l+3l  21+92  2502 
2555  2565  2?7St?536  2597  2607 
2659  2670  2t.80  i  269c  2701  2711 

12  3 
12  3 
1     2     3 

1+ 
1+ 
1+ 

5     6  i  7 
'5     0  |  7 
"5     67 

0   10 

O         n 

0     y 

3    9 

9.7 
9.8 

9.9 

2.2721 
2.2821+ 
2.2925 

2732  27^2  2752  2762  2773  27&3J2795  2803  231 1+ 
283I+  281+1:-  285I+  2865  2875  2385  2895  2905  2915 
2935  29I+6  2956:2966  2976  2986  2996  5006  3016 
■  1     . . -j 

1     2     3 

1     2      7- 

12      3 

1+ 
k 

h 

5     6  !  7 

5     6  i  7 
5    67 

8    0 
8    9 
0    -j 

Table of Neperian Logarithms of 10^+n

n              1        2        3        4        5         6         7         8         9

log_e 10^n     2.3026   4.6052   6.9078   9.2103   11.5129   13.8155   16.1181   18.4207   20.7233

Table of Neperian Logarithms of 10^-n

n              1        2        3        4         5         6         7         8         9

log_e 10^-n    3.6974   5.3948   7.0922   10.7897   12.4871   14.1845   17.8819   19.5793   21.2767

(In the second table the whole-number part of each entry is negative and the decimal
part is positive.)

1/ This table is reproduced from "Four Figure Mathematical Tables" by the late J. T.
Bottomley and published by Macmillan and Co., Ltd. (London).  The consent of the
publishers and representatives of the author has been obtained.
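
Table IV covers only numbers between 1 and 10; for other numbers the logarithm of the appropriate power of 10 is added. The Python sketch below illustrates this use of the two small tables and checks it against the standard library function `math.log`. The helper names are hypothetical, and the example number 440 is chosen only for illustration.

```python
import math

LN_10 = math.log(10.0)  # 2.3026 to four decimals, the first entry of the small table


def log_via_table(tabled_log, n):
    """Neperian log of m * 10**n, given the tabled log of m (1 <= m < 10)."""
    return tabled_log + n * LN_10


def bar_form(value):
    """Split a logarithm into a whole-number part and a positive decimal part."""
    whole = math.floor(value)
    return whole, value - whole


# 440 = 4.4 x 10^2; the tabled logarithm of 4.4 is 1.4816.
print(f"From table : {log_via_table(1.4816, 2):.4f}")
print(f"Direct     : {math.log(440.0):.4f}")

# log_e 10^-1 = -2.3026 is written with whole part -3 and decimal part .6974.
whole, decimal = bar_form(-LN_10)
print(whole, f"{decimal:.4f}")
```

The second helper shows why the entries for negative powers of 10 carry a negative whole number with a positive decimal.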

(3520-39) 
Table V.  Values of percentages transformed into degrees of an angle.  Angles of
equal information are given in the body of the table corresponding to observed percentages
along the left margin and top.  (Each angle ending in 5 is followed by a +
or a - sign for guidance when the last decimal is dropped.)

Table taken from article by Dr. Chester I. Bliss of the Institute for Plant Protection,
Moscow, Russia.  Reproduced by permission of the author.


        0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09

0.0     0       0.57    0.81    0.99    1.15-   1.28    1.40    1.52    1.62    1.72
0.1     1.81    1.90    1.99    2.07    2.14    2.22    2.29    2.36    2.43    2.50
0.2     2.56    2.63    2.69    2.75-   2.81    2.87    2.92    2.98    3.03    3.09
0.3     3.14    3.19    3.24    3.29    3.34    3.39    3.44    3.49    3.53    3.58
0.4     3.63    3.67    3.72    3.76    3.80    3.85-   3.89    3.93    3.97    4.01
0.5     4.05+   4.09    4.13    4.17    4.21    4.25+   4.29    4.33    4.37    4.40
0.6     4.44    4.48    4.52    4.55+   4.59    4.62    4.66    4.69    4.73    4.76
0.7     4.80    4.83    4.87    4.90    4.93    4.97    5.00    5.03    5.07    5.10
0.8     5.13    5.16    5.20    5.23    5.26    5.29    5.32    5.35+   5.38    5.41
0.9     5.44    5.47    5.50    5.53    5.56    5.59    5.62    5.65+   ....    5.71

        0.0     0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9

1       5.74    6.02    6.29    6.55-   6.80    7.04    7.27    7.49    7.71    ....
2       8.13    8.33    8.53    8.72    8.91    9.10    9.28    9.46    9.63    9.81
3       9.98    10.14   10.31   10.47   10.63   10.78   10.94   11.09   11.24   11.39
4       11.54   11.68   11.83   11.97   12.11   12.25-  12.39   12.52   12.66   12.79
5       12.92   13.05+  13.18   13.31   13.44   13.56   13.69   13.81   13.94   14.06
6       14.18   14.30   14.42   14.54   14.65+  14.77   14.89   15.00   15.12   15.23
7       15.34   15.45+  15.56   15.68   15.79   15.89   16.00   16.11   16.22   16.32
8       16.43   16.54   16.64   16.74   16.85-  16.95+  17.05+  17.16   17.26   17.36
9       17.46   17.56   17.66   17.76   17.85+  17.95+  18.05-  18.15-  18.24   18.34
10      18.44   18.53   18.63   18.72   18.81   18.91   19.00   19.09   19.19   19.28
11      19.37   19.46   19.55+  19.64   19.73   19.82   19.91   20.00   20.09   20.18
12      20.27   20.36   20.44   20.53   20.62   20.70   20.79   20.88   20.96   21.05-
13      21.13   21.22   21.30   21.39   21.47   21.56   21.64   21.72   21.81   21.89
14      21.97   22.06   22.14   22.22   22.30   22.38   22.46   22.55-  22.63   22.71
15      22.79   22.87   22.95-  23.03   23.11   23.19   23.26   23.34   23.42   23.50
16      23.58   23.66   23.73   23.81   23.89   23.97   24.04   24.12   24.20   24.27
17      24.35+  24.43   24.50   24.58   24.65+  24.73   24.80   24.88   24.95-  25.03
18      25.10   25.18   25.25+  25.33   25.40   25.48   25.55-  25.62   25.70   25.77
19      25.84   25.92   25.99   26.06   26.13   26.21   26.28   26.35-  26.42   26.49
20      26.56   26.64   26.71   26.78   26.85+  26.92   26.99   27.06   27.13   27.20
21      27.28   27.35-  27.42   27.49   27.56   27.63   27.69   27.76   27.83   27.90
22      27.97   28.04   28.11   28.18   28.25-  28.32   ....    28.45   28.52   28.59
23      28.66   28.73   28.79   28.86   28.93   29.00   29.06   29.13   29.20   29.27
24      29.33   29.40   29.47   29.53   29.60   29.67   29.73   29.80   29.87   29.93
Table V.  Values of percentages transformed into degrees of an angle.  Angles of
equal information are given in the body of the table corresponding to observed percentages
along the left margin and top.  (Continued)

        0.0     0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9

25      30.00   30.07   30.13   30.20   30.26   30.33   30.40   30.46   30.53   30.59
26      30.66   30.72   30.79   30.85+  30.92   30.98   31.05-  31.11   31.18   31.24
27      31.31   31.37   31.44   31.50   31.56   31.63   31.69   31.76   31.82   31.88
28      31.95-  32.01   32.08   32.14   32.20   32.27   32.33   32.39   32.46   32.52
29      32.58   32.65-  32.71   32.77   32.83   32.90   32.96   33.02   33.09   33.15-
30      33.21   33.27   33.34   33.40   33.46   33.52   33.58   33.65-  33.71   33.77
31      33.83   33.89   33.96   34.02   34.08   34.14   34.20   34.27   34.33   34.39
32      34.45-  34.51   34.57   34.63   34.70   34.76   34.82   34.88   34.94   35.00
33      35.06   35.12   35.18   35.24   35.30   35.37   35.43   35.49   35.55-  35.61
34      35.67   35.73   35.79   35.85-  35.91   35.97   36.03   36.09   36.15+  36.21
35      36.27   36.33   36.39   36.45+  36.51   36.57   36.63   36.69   36.75+  36.81
36      36.87   36.93   36.99   37.05-  37.11   37.17   37.23   37.29   37.35   37.41
37      37.47   37.52   37.58   37.64   37.70   37.76   37.82   37.88   37.94   38.00
38      38.06   38.12   38.17   38.23   38.29   38.35+  38.41   38.47   38.53   38.59
39      38.65-  38.70   38.76   38.82   38.88   38.94   39.00   39.06   39.11   39.17
40      39.23   39.29   39.35-  39.41   39.47   39.52   39.58   39.64   39.70   39.76
41      39.82   39.87   39.93   39.99   40.05-  40.11   40.16   40.22   40.28   40.34
42      40.40   40.46   40.51   40.57   40.63   40.69   40.74   40.80   40.86   40.92
43      40.98   41.03   41.09   41.15-  41.21   41.27   41.32   41.38   41.44   41.50
44      41.55+  41.61   41.67   41.73   41.78   41.84   41.90   41.96   42.02   42.07
45      42.13   42.19   42.25-  42.30   42.36   42.42   42.48   42.53   42.59   42.65-
46      42.71   42.76   42.82   42.88   42.94   42.99   43.05-  43.11   43.17   43.22
47      43.28   43.34   43.39   43.45+  43.51   43.57   43.62   43.68   43.74   43.80
48      43.85+  43.91   43.97   44.03   44.08   44.14   44.20   44.25+  44.31   44.37
49      44.43   44.48   44.54   44.60   44.66   44.71   44.77   44.83   44.89   44.94
50      45.00   45.06   45.11   45.17   45.23   45.29   45.34   45.40   45.46   45.52
51      45.57   45.63   45.69   45.75-  45.80   45.86   45.92   45.97   46.03   46.09
52      46.15-  46.20   46.26   46.32   46.38   46.43   46.49   46.55-  46.61   46.66
53      46.72   46.78   46.83   46.89   46.95+  47.01   47.06   47.12   47.18   47.24
54      47.29   47.35+  47.41   47.47   47.52   47.58   47.64   47.70   47.75+  47.81
55      47.87   47.93   47.98   48.04   48.10   48.16   48.22   48.27   48.33   48.39
56      48.45   48.50   48.56   48.62   48.68   48.73   48.79   48.85+  48.91   48.97
57      49.02   49.08   49.14   49.20   49.26   49.31   49.37   49.43   49.49   49.54
58      49.60   49.66   49.72   49.78   49.84   49.89   49.95+  50.01   50.07   50.13
59      50.18   50.24   50.30   50.36   50.42   50.48   50.53   50.59   50.65+  50.71
60      50.77   50.83   50.89   50.94   51.00   51.06   51.12   51.18   51.24   51.30
61      51.35+  51.41   51.47   51.53   51.59   51.65-  51.71   51.77   51.83   51.88
62      51.94   52.00   52.06   52.12   52.18   52.24   52.30   52.36   52.42   52.48
63      52.53   52.59   52.65+  52.71   52.77   52.83   52.89   52.95+  53.01   53.07
64      53.13   53.19   53.25-  53.31   53.37   53.43   53.49   53.55-  53.61   53.67

Table V.  Values of percentages transformed into degrees of an angle.  Angles of
equal information are given in the body of the table corresponding to observed percentages
along the left margin and top.  (Continued)

        0.0     0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9

65      53.73   53.79   53.85-  53.91   53.97   54.03   54.09   54.15+  54.21   54.27
66      54.33   54.39   54.45+  54.51   54.57   54.63   54.70   54.76   54.82   54.88
67      54.94   55.00   55.06   55.12   55.18   55.24   55.30   55.37   55.43   55.49
68      55.55+  55.61   55.67   55.73   55.80   55.86   55.92   55.98   56.04   56.11
69      56.17   56.23   56.29   56.35+  56.42   56.48   56.54   56.60   56.66   56.73
70      56.79   56.85+  56.91   56.98   57.04   57.10   57.17   57.23   57.29   57.35+
71      57.42   57.48   57.54   57.61   57.67   57.73   57.80   57.86   57.92   57.99
72      58.05+  58.12   58.18   58.24   58.31   58.37   58.44   58.50   58.56   58.63
73      58.69   58.76   58.82   58.89   58.95+  59.02   59.08   59.15-  59.21   59.28
74      59.34   59.41   59.47   59.54   59.60   59.67   59.74   59.80   59.87   59.93
75      60.00   60.07   60.13   60.20   60.27   60.33   60.40   60.47   60.53   60.60
76      60.67   60.73   60.80   60.87   60.94   61.00   61.07   61.14   61.21   61.27
77      61.34   61.41   61.48   61.55-  61.62   61.68   61.75+  61.82   61.89   61.96
78      62.03   62.10   62.17   62.24   62.31   62.37   62.44   62.51   62.58   62.65+
79      62.72   62.80   62.87   62.94   63.01   63.08   63.15-  63.22   63.29   63.36
80      63.44   63.51   63.58   63.65+  63.72   63.79   63.87   63.94   64.01   64.08
81      64.16   64.23   64.30   64.38   64.45+  64.52   64.60   64.67   64.75-  64.82
82      64.90   64.97   65.05-  65.12   65.20   65.27   65.35-  65.42   65.50   65.57
83      65.65-  65.73   65.80   65.88   65.96   66.03   66.11   66.19   66.27   66.34
84      66.42   66.50   66.58   66.66   66.74   66.81   66.89   66.97   67.05+  67.13
85      67.21   67.29   67.37   67.45+  67.54   67.62   67.70   67.78   67.86   67.94
86      68.03   68.11   68.19   68.28   68.36   68.44   68.53   68.61   68.70   68.78
87      68.87   68.95+  69.04   69.12   69.21   69.30   69.38   69.47   69.56   69.64
88      69.73   69.82   69.91   70.00   70.09   70.18   70.27   70.36   70.45   70.54
89      70.63   70.72   70.81   70.91   71.00   71.09   71.19   71.28   71.37   71.47
90      71.56   71.66   71.76   71.85+  71.95+  72.05-  72.15-  72.24   72.34   72.44
91      72.54   72.64   72.74   72.84   72.95-  73.05-  73.15+  73.26   73.36   73.46
92      73.57   73.68   73.78   73.89   74.00   74.11   74.21   74.32   74.44   74.55-
93      74.66   74.77   74.88   75.00   75.11   75.23   75.35-  75.46   75.58   75.70
94      75.82   75.94   76.06   76.19   76.31   76.44   76.56   76.69   76.82   76.95-
95      77.08   77.21   77.34   77.48   77.61   77.75+  77.89   78.03   78.17   78.32
96      78.46   78.61   78.76   78.91   79.06   79.22   79.37   79.53   79.69   79.86
97      80.02   80.19   80.37   80.54   80.72   80.90   81.09   81.28   81.47   81.67
98      81.87   82.08   82.29   82.51   82.73   82.96   83.20   83.45+  83.71   83.98
Table V.  Values of percentages transformed into degrees of an angle.  Angles of
equal information are given in the body of the table corresponding to observed percentages
along the left margin and top.  (Continued)

        0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09

99.0    84.26   84.29   84.32   84.35-  84.38   84.41   84.44   84.47   84.50   84.53
99.1    84.56   84.59   84.62   84.65-  84.68   84.71   84.74   84.77   84.80   84.84
99.2    84.87   84.90   84.93   84.97   85.00   85.03   85.07   85.10   85.13   85.17
99.3    85.20   85.24   85.27   85.31   85.34   85.38   85.41   85.45-  85.48   85.52
99.4    85.56   85.60   85.63   85.67   85.71   85.75-  85.79   85.83   85.87   85.91
99.5    85.95-  85.99   86.03   86.07   86.11   86.15   86.20   86.24   86.28   86.33
99.6    86.37   86.42   86.47   86.51   86.56   86.61   86.66   86.71   86.76   86.81
99.7    86.86   86.91   86.97   87.02   87.08   87.13   87.19   87.25+  87.31   87.37
99.8    87.44   87.50   87.57   87.64   87.71   87.78   87.86   87.93   88.01   88.10
99.9    88.19   88.28   88.38   88.48   88.60   88.72   88.85+  89.01   89.19   89.43
100.0   90.00
(507-58) 
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